Question
I am using the arules package to find association rules in point-of-sale retail data. I am extracting transaction detail from a database, then placing it in a transactions object. I'm new to arules and am trying to figure out how to populate the itemInfo data frame in the transactions object. Right now, I'm just bringing in the transaction and item IDs (both numeric), which provide little context. I would like to be able to add an item description, as well as product hierarchy levels.
Below is the process I'm using today:
Data comes through from the database in the below format:
Transaction_ID Item_ID
-------------- -----------
100            1
100            2
100            3
101            2
101            3
102            1
102            2
To create the transactions object, I'm using the below command, as described in the arules documentation:
txdata <- as(split(txdata[, "Item_ID"], txdata[, "Transaction_ID"]), "transactions")
Note: I've found that I need to have a numeric value for the Item_ID, otherwise I run into major performance issues using a string (due to poor performance of split when using factored strings).
Create and view the association rules:
rules <- apriori(txdata, parameter = list(support=0.00015, confidence=0.5))
inspect(head((sort(rules, by="confidence")), n=5))
When the rules come back, they are listed by Item_ID, which is not helpful to me. I want to be able to display them by the ID and/or description. I would also like to take advantage of the aggregation features built into the arules package.
Answer 1:
You can change the names of items using itemInfo. Here is an example:
R> df <- data.frame(
     TID = c(1, 1, 2, 2, 2, 3),
     item = c("a", "b", "a", "b", "c", "b")
   )
R> trans <- as(split(df[,"item"], df[,"TID"]), "transactions")
### this is how you replace item labels and set a hierarchy (here level1)
R> myLabels <- c("milk", "butter", "beer")
R> myLevel1 <- c("dairy", "dairy", "beverage")
R> itemInfo(trans) <- data.frame(labels = myLabels, level1 = myLevel1)
R> inspect(trans)
  items    transactionID
1 {milk,
   butter}             1
2 {milk,
   butter,
   beer}               2
3 {butter}             3
### now you can use aggregate()
R> inspect(aggregate(trans, itemInfo(trans)[["level1"]]))
  items       transactionID
1 {dairy}                 1
2 {beverage,
   dairy}                 2
3 {dairy}                 3
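The example above assigns labels by position, so the replacement vectors must follow the order in which arules stores the items. With numeric Item_IDs and descriptions coming from a separate table, one way to keep them aligned is to match on itemLabels(). The following is only a sketch under assumptions: item_master and its columns (Description, Level1) are hypothetical names standing in for whatever item table the database provides.

```r
library(arules)

# Transaction detail in the shape shown in the question (numeric IDs)
txdata <- data.frame(
  Transaction_ID = c(100, 100, 100, 101, 101, 102, 102),
  Item_ID        = c(1, 2, 3, 2, 3, 1, 2)
)

# Hypothetical item master table; rows are deliberately not in Item_ID order
item_master <- data.frame(
  Item_ID     = c(3, 1, 2),
  Description = c("beer", "milk", "butter"),
  Level1      = c("beverage", "dairy", "dairy"),
  stringsAsFactors = FALSE
)

trans <- as(split(txdata[, "Item_ID"], txdata[, "Transaction_ID"]), "transactions")

# itemLabels(trans) holds the Item_IDs as strings, in the order arules stores them
idx <- match(itemLabels(trans), as.character(item_master$Item_ID))
stopifnot(!anyNA(idx))  # every item needs a row in the master table

# Replace the labels with descriptions, keep the ID, and add one hierarchy level
itemInfo(trans) <- data.frame(
  labels  = item_master$Description[idx],
  item_id = item_master$Item_ID[idx],
  level1  = item_master$Level1[idx],
  stringsAsFactors = FALSE
)

inspect(trans)
```

Keeping item_id as an extra column in itemInfo means the original ID stays available next to the description.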
You can find more info using class?transactions and ?aggregate.
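To connect this back to the rules in the question: once itemInfo carries the labels, apriori() reports rules using those labels, and the aggregated transactions can be mined the same way to get rules at the hierarchy level. A minimal sketch reusing the toy data from the example above; the support and confidence thresholds are picked only so that three transactions produce some rules, and are not a recommendation for real data.

```r
library(arules)

# Same toy data and labels as in the example above
df <- data.frame(
  TID  = c(1, 1, 2, 2, 2, 3),
  item = c("a", "b", "a", "b", "c", "b")
)
trans <- as(split(df[, "item"], df[, "TID"]), "transactions")
itemInfo(trans) <- data.frame(
  labels = c("milk", "butter", "beer"),
  level1 = c("dairy", "dairy", "beverage"),
  stringsAsFactors = FALSE
)

# Illustrative thresholds for this tiny data set only
params <- list(support = 0.3, confidence = 0.5)

# Item-level rules: shown with the new labels instead of the raw item codes
rules <- apriori(trans, parameter = params, control = list(verbose = FALSE))
inspect(head(sort(rules, by = "confidence"), n = 5))

# Category-level rules: aggregate the transactions to level1 first, then mine
trans_l1 <- aggregate(trans, itemInfo(trans)[["level1"]])
rules_l1 <- apriori(trans_l1, parameter = params, control = list(verbose = FALSE))
inspect(rules_l1)
```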
Hope this helps, Michael
Source: https://stackoverflow.com/questions/28952011/adding-item-information-to-transaction-object-in-arules