Count Occurrences of a List in R

甜味超标 2021-01-06 11:19

I have a list of roughly 100,000 occurrences of items being ordered together that I have pasted into one column so I can count the number of times each combination occurs.

2 Answers
  •  粉色の甜心
    2021-01-06 11:59

    Your initial approach was pretty close to what I think you want. Combining those into a single factor will definitely work, provided you combine them in the same order, so that you don't end up with "Fries, Burger" and "Burger, Fries" counted as two different combinations.

    There may be an easier way of doing what you want, but I can't think of what it is right now. Nevertheless, I think this does what you're looking for:

    # Let's assume your data looks like this:
    > df
                           Var1                      Var2 Var3
    1               Onion Rings               Onion Rings    1
    2  Pineapple Cheddar Burger               Onion Rings    1
    3               Onion Rings  Pineapple Cheddar Burger    1
    4  Pineapple Cheddar Burger  Pineapple Cheddar Burger    1
    5               Onion Rings               Onion Rings    1
    6  Pineapple Cheddar Burger               Onion Rings    1
    7               Onion Rings  Pineapple Cheddar Burger    1
    8  Pineapple Cheddar Burger  Pineapple Cheddar Burger    1
    9             Fountain Soda             Fountain Soda    1
    10             French Fries             Fountain Soda    1
    
    # Now, for each row
    #     1. sort the Var1 and Var2,
    #     2. combine the sorted vars, and
    #     3. convert them back into a factor
    
    df$sortcomb <- as.factor(apply(df[,1:2], 1, function(x) paste(sort(x), collapse=", ")))
    
    table(df$sortcomb) # then use table as per normal
    
    library(plyr) # ddply() and .() come from the plyr package
    ddply(df, .(sortcomb), summarize, count=length(sortcomb)) # or ddply
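
    For what it's worth, here is a self-contained sketch of the same idea that you can paste into a fresh session. It is not the exact code above: it swaps the row-wise apply() for pmin()/pmax(), which should be quicker on ~100,000 rows but only works under the assumption that there are exactly two item columns. The sample data and the names first, second and counts are made up for illustration.

    ```r
    # Hypothetical sample data, loosely mirroring the df printed above
    df <- data.frame(
      Var1 = c("Onion Rings", "Pineapple Cheddar Burger", "Onion Rings",
               "Pineapple Cheddar Burger", "Fountain Soda", "French Fries"),
      Var2 = c("Onion Rings", "Onion Rings", "Pineapple Cheddar Burger",
               "Pineapple Cheddar Burger", "Fountain Soda", "Fountain Soda"),
      stringsAsFactors = FALSE
    )

    # For each row, pmin() picks the alphabetically earlier item and pmax() the later one,
    # so "A, B" and "B, A" end up with the same key (assumes exactly two columns)
    first  <- pmin(df$Var1, df$Var2)
    second <- pmax(df$Var1, df$Var2)
    df$sortcomb <- paste(first, second, sep = ", ")

    # Count each combination and sort by frequency, most common first
    counts <- as.data.frame(table(df$sortcomb), stringsAsFactors = FALSE)
    names(counts) <- c("combination", "count")
    counts <- counts[order(-counts$count), ]
    print(counts)
    ```

    If you have more than two items per order, the apply()/sort() version above is the one to use, since pmin()/pmax() only give you the first and last item.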
    
