Adding item information to transaction object in arules

Submitted by 不羁的心 on 2019-12-23 01:57:15

Question


I am using the arules package to find association rules in point-of-sale retail data. I extract the transaction detail from a database and load it into a transactions object. I'm new to arules and am trying to figure out how to populate the itemInfo data frame in that object. Right now I only bring in the transaction and item IDs (both numeric), which provide little context. I would like to add an item description as well as product hierarchy levels.

Below is the process I'm using today:

  1. Data comes from the database in the following format:

    Transaction_ID     Item_ID
    --------------     ----------- 
    100                1
    100                2
    100                3
    101                2
    101                3
    102                1
    102                2
    
  2. To create the transactions object, I use the command below, as described in the arules documentation:

    txdata <- as(split(txdata[, "Item_ID"], txdata[, "Transaction_ID"]), "transactions")
    

    Note: I've found that Item_ID needs to be numeric. With string IDs I run into major performance issues, because split is slow on strings stored as factors.

  3. Create and view the association rules:

    rules <- apriori(txdata, parameter = list(support=0.00015, confidence=0.5))
    inspect(head(sort(rules, by = "confidence"), n = 5))
    

When the rules come back, they are listed by Item_ID, which is not helpful to me. I want to be able to display them by ID and/or description. I would also like to take advantage of the aggregation features built into the arules package.


Answer 1:


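You can change the item labels through itemInfo. The rows of the data frame you assign must be in the same order as the items in the transactions object (check with itemLabels(trans)). Here is an example: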

R> df <- data.frame(
   TID = c(1,1,2,2,2,3), 
   item=c("a","b","a","b","c", "b")
 )
R> trans <- as(split(df[,"item"], df[,"TID"]), "transactions")

### this is how you replace item labels and set a hierarchy (here level1)
R> myLabels <- c("milk", "butter", "beer")
R> myLevel1 <- c("dairy", "dairy", "beverage")
R> itemInfo(trans) <- data.frame(labels = myLabels, level1 = myLevel1)

R> inspect(trans)
     items    transactionID
  1 {milk,                
     butter}             1
  2 {milk,                
     butter,              
     beer}               2
  3 {butter}             3

### now you can use aggregate()
R> inspect(aggregate(trans, itemInfo(trans)[["level1"]]))
     items      transactionID
  1 {dairy}                1
  2 {beverage,              
     dairy}                2
  3 {dairy}                3

You can find more info using class?transactions and ?aggregate.

Hope this helps, Michael
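
As a follow-up to the answer above, here is a minimal sketch of how the same idea might be applied to the numeric IDs from the question. It is not part of the original answer. The `items` lookup table, its column names (`Description`, `Level1`) and all descriptions are hypothetical stand-ins for whatever the database provides. Because the answer assigns labels by position, the sketch uses match() against itemLabels(trans) so that the lookup rows are lined up with the item order before they are assigned.

```r
library(arules)

# Transaction detail in the format shown in the question (numeric IDs).
txdata <- data.frame(
  Transaction_ID = c(100, 100, 100, 101, 101, 102, 102),
  Item_ID        = c(1, 2, 3, 2, 3, 1, 2)
)

# Hypothetical item master table: names and hierarchy are invented for
# illustration and are deliberately not sorted by Item_ID.
items <- data.frame(
  Item_ID     = c(3, 1, 2),
  Description = c("beer", "milk", "butter"),
  Level1      = c("beverage", "dairy", "dairy"),
  stringsAsFactors = FALSE
)

trans <- as(split(txdata[, "Item_ID"], txdata[, "Transaction_ID"]),
            "transactions")

# Item labels are stored as character, so compare against character IDs.
# idx[i] is the row of `items` that describes the i-th item in `trans`.
idx <- match(itemLabels(trans), as.character(items$Item_ID))
stopifnot(!anyNA(idx))  # every item must have a row in the lookup table

# Replace the labels and attach the hierarchy level plus the original ID.
itemInfo(trans) <- data.frame(
  labels  = items$Description[idx],
  level1  = items$Level1[idx],
  item_id = items$Item_ID[idx],
  stringsAsFactors = FALSE
)

# Rules are now reported with descriptions instead of numeric IDs.
# The thresholds are chosen for this toy data set only.
rules <- apriori(trans,
                 parameter = list(support = 0.5, confidence = 0.5),
                 control = list(verbose = FALSE))
inspect(sort(rules, by = "confidence"))

# Roll the items up to the hierarchy level, as in the answer.
inspect(aggregate(trans, itemInfo(trans)[["level1"]]))
```

Matching on the ID rather than relying on row order means the lookup table can come back from the database in any order. Keeping the numeric ID as an extra itemInfo column allows rules to be traced back to the source data.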



Source: https://stackoverflow.com/questions/28952011/adding-item-information-to-transaction-object-in-arules
