arules | 易学教程

R-convert transaction format dataset to basket format for sequence mining


Question: ORIGINAL TABLE

CELL NUMBER   ACTIVITY   TIME
001           call a     12.23
002           call b     01.00
002           call d     01.09
001           call b     12.25
003           call a     12.23
002           call a     02.07
003           ...

matching transaction with %in% in arules package R


Question: I need to find transactions matching some rules. The following code used to work, but now R picks up %in% from the base package instead of the one from arules.

    matchRules=function(rules,transactions){
      id.match=which(transactions %in% rules)
      matchedTrx=transactions[id.match]
      summary(matchedTrx)
      return(matchedTrx)
    }

I tried arules::%in% but it doesn't work. If I use:

    id.match=which(transactions arules::%in% rules)

I get:

    Error: unexpected symbol in "id.match=which(transactions arules"

Thanks for your
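The parse error in the question comes from writing a namespace prefix in infix position. As a minimal sketch in base R only (the vectors `x` and `lookup` are made up for illustration), an infix operator can be called as an ordinary function by back-quoting its name, and that form accepts a `pkg::` prefix. The analogous call for arules would presumably be arules::`%in%`(transactions, rules); that is an assumption and is not run here.

```r
# Hypothetical example data; not taken from the question.
x      <- c("a", "b", "z")
lookup <- c("a", "b", "c")

# Usual infix form: no namespace prefix is possible in this position.
infix_result <- x %in% lookup

# Prefix form: back-quote the operator name, then qualify it with a namespace.
prefix_result <- base::`%in%`(x, lookup)

# Both forms call the same function, so the results are identical.
same <- identical(infix_result, prefix_result)
print(same)
```

Whether the arules method accepts a rules object as its second argument depends on the installed arules version, so that part would need checking against the package documentation.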

arules: How to find the data matching an lhs(rule) in R or an SQL WHERE clause?


Question: I'm finding working with the arules package a bit tricky. I'm using the apriori algorithm to find association rules; something similar to an example in the arules documentation.

    data("AdultUCI")
    dim(AdultUCI)
    AdultUCI[1:2,]
    #Ignore everything from here to the last two lines, this is just data preparation
    ## remove attributes
    AdultUCI[["fnlwgt"]] <- NULL
    AdultUCI[["education-num"]] <- NULL
    ## map metric attributes
    AdultUCI[[ "age"]] <- ordered(cut(AdultUCI[[ "age"]], c(15,25,45,65,100)), labels

Creating specific rules with arules in r


Question: I have a large data set (a matrix of 0s and 1s) with 200 variables (each variable is an item) and almost 1M rows (each row is a transaction). I use the "arules" package in R for association rule mining. I picked 2 items and I want to create all the rules that have at least one of them on the left-hand side of the rule. The code that I wrote is:

    rules <- apriori(data, parameter = list(support = 0.1, confidence = 0.1, minlen =2),appearance = list(lhs=c("itemA=1","itemB=1"),default="rhs"))

But this

R arules, mine only rules from specific column


Question: I would like to mine specific rhs rules. There is an example in the documentation which demonstrates that this is possible, but only for a specific case (as we see below). First, a data set to illustrate my problem:

    input <- matrix( c( rep(10001,6) , rep(10002,3) , rep(10003,3), 100001,100002,100003,100004,100005,100006,100002,100003,100007,100002,100003,100008,rep('a',6),rep('b',6)), ncol=3)
    colnames(input) <- c(letters[1:3])
    input <- as.data.frame(input)

Now I can create rules: r <- apriori

Main sequences from Arules Sequence Mining in R


Question: How can I remove the sub-sequences from the output of the cspade algorithm in the arulesSequences package in R? For example, suppose my data (Sample.txt) is as below.

Column names: sequenceID, EventID, size, Item

    1 1 1 A
    1 2 1 B
    1 3 1 C
    1 4 1 D
    2 1 1 A
    2 2 1 B
    2 3 1 C
    3 1 1 A
    3 2 1 B
    3 3 1 C
    3 4 1 D

After running the arulesSequences lines of code below:

    library("arulesSequences")
    #### while importing the Sample.txt remove the column names #####
    SymptomArulesSeq <- read_baskets("Sample.txt",sep = "[ \t]+",info = c("sequenceID",

Converting from regular format to sparse format for arules package


Question: I am trying to convert my regular data set to sparse format. All the documentation has examples that already use the 'sparse format'. Can you help me, please? My sample data set:

    ID Item
    1  Avas
    2  Alo
    2  Erbi
    8  Abra
    8  Ali
    9  Inj
    10 Avas
    11 Avas

Answer 1: Convert to the transactions class:

    trans1 <- as(split(df1[,"Item"], df1[,"ID"]), "transactions")

Result:

    summary(trans1)
    # transactions as itemMatrix in sparse format with
    #  6 rows (elements/itemsets/transactions) and
    #  6 columns (items) and a density of 0.2222222
    #
    # most
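The answer relies on split() to group items by ID before coercing to transactions. As a rough sketch in base R only, the grouping step can be inspected on its own. The data frame below is rebuilt from the sample in the question; the variable names `baskets`, `n_transactions`, `n_items` and `density` are assumptions introduced for illustration, and the arules coercion itself is not run here.

```r
# Sample data rebuilt from the question (ID / Item pairs).
df1 <- data.frame(
  ID   = c(1, 2, 2, 8, 8, 9, 10, 11),
  Item = c("Avas", "Alo", "Erbi", "Abra", "Ali", "Inj", "Avas", "Avas"),
  stringsAsFactors = FALSE
)

# Group the items by ID: one character vector (basket) per transaction.
baskets <- split(df1[, "Item"], df1[, "ID"])

# Count the rows and columns the sparse matrix would have.
n_transactions <- length(baskets)
n_items        <- length(unique(df1$Item))

# Density = filled cells / all cells, counting each item once per basket.
filled  <- sum(lengths(lapply(baskets, unique)))
density <- filled / (n_transactions * n_items)

print(c(n_transactions, n_items))
print(density)
```

The counts from this sketch (6 baskets, 6 distinct items, 8 filled cells out of 36) agree with the 6 rows, 6 columns and density of 0.2222222 reported by summary(trans1) in the answer.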

R arulesSequences Find which patterns are supported by a sequence


Question: I'm having trouble with the arulesSequences library in R. I have a transactional dataset with temporal information (here, let's use the default zaki dataset). I use SPADE (the cspade function) to find the frequent subsequences in the dataset.

    library(arulesSequences)
    data(zaki)
    frequent_sequences <- cspade(zaki, parameter=list(support=0.5))

Now, what I want is to find, for each sequence (i.e. for each customer), which frequent subsequences it supports. I tried various combinations of

Arules Sequence Mining in R


Question: I am looking to use the arulesSequences package in R. However, I have no idea how to coerce my data frame into an object that this package can work with. Here is a toy dataset that replicates my data structure:

    ids <- c(rep("X", 5), rep("Y", 5), rep("Z", 5))
    seq <- rep(1:5,3)
    val <- sample(LETTERS, 15, replace=T)
    df <- data.frame(ids, seq, val)
    df

       ids seq val
    1    X   1   T
    2    X   2   H
    3    X   3   V
    4    X   4   A
    5    X   5   X
    6    Y   1   D
    7    Y   2   B
    8    Y   3   A
    9    Y   4   D
    10   Y   5   P
    11   Z   1   Q
    12   Z   2   R
    13   Z   3   W
    14   Z   4   W
    15   Z   5   P

Any

Transform csv into transactions for arules [duplicate]


Question: This question already has answers here: How to prep transaction data into basket for arules (2 answers). Closed 3 years ago.

I have a subset from a database in csv which has several different columns, and I would like to convert the data into transactions. I've already read this post:

    library(arules)
    library(arulesViz)
    trans = read.transactions("data.csv", format = "single", sep = ",", cols = c("EMAIL", "BRAND"))

However, I wasn't able to convert my data with the proposed solution:

    CATEGORY BRAND