How to convert a data.frame to transactions for arules

Posted by 我的未来我决定 on 2019-11-26 10:35:08

Question


I read data from a CSV file. The data has 3 columns: one is the transaction id, and the other two are the product and the product category. I need to convert this into transactions in order to use the apriori function in arules, but the conversion fails with an error:

dat <- read.csv("spss.csv", header = TRUE, sep = ",", as.is = TRUE)
dat[,2] <- factor(dat[,2])
dat[,3] <- factor(dat[,3])
spssdat <- dat[,c(1,2,3)]
str(spssdat)

'data.frame':   108919 obs. of  3 variables:
 $ Transaction_id: int  3000312 3000312 3001972 3003361 3003361 3003361 3003361 3003361 3003361 3004637 ...
 $ product_catalog : Factor w/ 9 levels "AIM","BA","IM",..: 1 1 5 7 7 7 7 7 7 1 ...
 $ product      : Factor w/ 332 levels "ACM","ACTG/AIM",..: 7 7 159 61 61 61 61 61 61 7 ...

trans4 <- as(spssdat, "transactions")

Error in as(spssdat, "transactions") : 
  no method or default for coercing “data.frame” to “transactions”

If the data has only two columns, this works:

trans4 <- as(split(spssdat[,2], spssdat[,1]), "transactions")

But I don't know how to convert the data when it has 3 columns. There are usually additional columns, such as category attributes or customer attributes, so the data typically has more than 2 columns, and I need to find rules across multiple columns.


Answer 1:


I found some information on this website that worked for me. Let me copy the relevant paragraph:

The dataframe can be in either a normalized (single) form or a flat file (basket) form.
When the file is in basket form it means that each record represents a transaction where the items in the basket are represented by columns.
When the dataset is in single form it means that each record represents one single item and each item contains a transaction id.

To load transactions from a file, use read.transactions. In both your case and mine, the file is in single form.
I used the following code to load a .csv file as transactions:

trans = read.transactions("some_data.csv", format = "single", sep = ",", cols = c("transactionID", "productID"))

To fully understand the command above, take a look at the read.transactions manual page, available by typing ?read.transactions in the R console.
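
As a minimal sketch of the single-form case, the example below writes a small toy table to a temporary CSV file and reads it back with read.transactions. The toy data, the object name toy, and the temporary file path are assumptions made for illustration. The column names mirror those in the question. The rm.duplicates = TRUE argument is included because data like the question's repeats the same (transaction id, product) pair on several rows.

```r
library(arules)

# Hypothetical toy data in "single" form: one (transaction id, item) pair per row.
# Transaction 1 lists product "ACM" twice, like the repeated rows in the question.
toy <- data.frame(
  Transaction_id  = c(1, 1, 1, 2, 3, 3),
  product_catalog = c("AIM", "AIM", "BA", "IM", "AIM", "BA"),
  product         = c("ACM", "ACM", "ACTG", "ACM", "XYZ", "ACTG")
)

# Write the table to a temporary CSV with a header row (stand-in for the real file).
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE, quote = FALSE)

# cols names the transaction-id column first and the item column second;
# because cols is a character vector, the first line is read as a header.
trans <- read.transactions(
  csv_path,
  format = "single",
  sep = ",",
  cols = c("Transaction_id", "product"),
  rm.duplicates = TRUE
)

summary(trans)   # overview of the transactions object
inspect(trans)   # list the items in each transaction
```

Only the two columns named in cols are used here; the product_catalog column in the file is ignored by this call.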




Answer 2:


I was attempting to do the same thing, and even after converting every column of my data.frame to a factor, I still could not coerce it into an itemMatrix of transactions. Then I realized I had never loaded the "arules" package in the session I was working in. A silly mistake, but I wanted to mention it in case anyone else runs into the same problem. Try the simple stuff first:

library("arules")



Answer 3:


You need to first convert "Transaction_id" into a factor variable.
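
A minimal sketch of what this looks like, using hypothetical toy data shaped like the question's spssdat. Note that coercing a data.frame of factors treats each row as one transaction. The second half of the sketch is not part of this answer: it is one possible way, under that assumption, to get one transaction per Transaction_id with items drawn from both columns, by extending the split approach from the question.

```r
library(arules)

# Hypothetical toy data shaped like the question's spssdat
spssdat <- data.frame(
  Transaction_id  = c(3000312, 3000312, 3001972, 3003361),
  product_catalog = c("AIM", "BA", "IM", "BA"),
  product         = c("ACM", "ACTG/AIM", "ACM", "ACTG/AIM"),
  stringsAsFactors = FALSE
)

# Make every column a factor, including the numeric id column
spssdat[] <- lapply(spssdat, factor)

# Each ROW becomes one transaction; items are labelled "column=level"
row_trans <- as(spssdat, "transactions")
inspect(row_trans)

# Alternative (an assumption, not from the answer): one transaction per id.
# Stack both attribute columns into a long (id, item) table, label each item
# with its column of origin, and drop duplicate pairs before splitting.
long <- unique(data.frame(
  id   = rep(spssdat$Transaction_id, times = 2),
  item = c(paste0("product_catalog=", spssdat$product_catalog),
           paste0("product=", spssdat$product)),
  stringsAsFactors = FALSE
))
id_trans <- as(split(long$item, long$id), "transactions")
inspect(id_trans)
```

Which form is appropriate depends on whether the rules should describe individual rows or whole baskets grouped by transaction id.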



Source: https://stackoverflow.com/questions/17313450/how-to-convert-data-frame-to-transactions-for-arules
