Count the number of times each combination of events appears in data frame columns

烂漫一生 posted on 2019-11-29 13:08:45

As mentioned, you can do this with factor() and expand.grid() (or any other way of generating all possible combinations):

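# All ordered pairs of events, minus pairs of an event with itself
all.possible <- expand.grid(c('a','b','c'), c('a','b','c'))
all.possible <- all.possible[all.possible[, 1] != all.possible[, 2], ]
# Sort within each pair so that a-c and c-a collapse to one label
all.possible <- unique(apply(all.possible, 1, function(x) paste(sort(x), collapse='-')))

df <- data.frame('x' = c('a', 'b', 'c', 'c', 'c'),
                 'y' = c('c', 'c', 'a', 'a', 'b'))
# Label each row the same way; the factor levels keep zero-count combinations
table(factor(apply(df, 1, function(x) paste(sort(x), collapse='-')), levels=all.possible))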
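As one "other way" of getting the combinations, here is a minimal sketch using combn() from base R, which generates the unordered pairs directly so no filtering or de-duplication is needed. The variable names events, pairs and counts are made up for illustration; the data is the same toy df as above.

```r
# Hypothetical alternative: build the unordered pairs directly with combn()
events <- c("a", "b", "c")
all.possible <- combn(sort(events), 2, paste, collapse = "-")

df <- data.frame(x = c("a", "b", "c", "c", "c"),
                 y = c("c", "c", "a", "a", "b"),
                 stringsAsFactors = FALSE)

# Label every row with its sorted pair, then count against the fixed levels
pairs <- apply(df, 1, function(row) paste(sort(row), collapse = "-"))
counts <- table(factor(pairs, levels = all.possible))
counts
```

Because the levels are fixed up front, a combination that never occurs (a-b here) still shows up with a count of 0.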

This should do it:

res = table(df)

To convert to a data frame:

resdf = as.data.frame(res)

The resdf data.frame looks like:

  x y Freq
1 a a    0
2 b a    0
3 c a    2
4 a b    0
5 b b    0
6 c b    1
7 a c    1
8 b c    1
9 c c    0

Note that this answer takes order into account. If the order of the columns is unimportant, sort each row of the original data.frame before tabulating, so that a-c is treated the same as c-a:

df1 = as.data.frame(t(apply(df,1,sort)))
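To see the row-sorting idea end to end, here is a small self-contained sketch on the same toy data. One assumption added here that is not in the answer: both sorted columns are turned into factors with a shared set of levels (lv), so the table stays 3 x 3 even though the second column only ever contains "c". The names sorted, lv, first, second and res1 are illustrative.

```r
df <- data.frame(x = c("a", "b", "c", "c", "c"),
                 y = c("c", "c", "a", "a", "b"),
                 stringsAsFactors = FALSE)

# apply() returns one sorted row per column, so transpose back to 5 x 2
sorted <- t(apply(df, 1, sort))

# Shared levels (an assumption of this sketch) keep all combinations in the table
lv <- sort(unique(unlist(df)))
df1 <- data.frame(first  = factor(sorted[, 1], levels = lv),
                  second = factor(sorted[, 2], levels = lv))

res1 <- table(df1)
res1
```

After sorting, a-c and c-a land in the same cell, so only the cells with first <= second can be non-zero.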

An alternative, because I was a bit bored. Perhaps a bit more generalised? But probably still uglier than it could be...

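library(plyr)  # provides ddply(), .() and summarise()

# Frequency table of (x, y) pairs as a data.frame, like resdf above
df2 <- as.data.frame(table(df))
# Label each pair with its sorted members; pairs with x == y yield NULL
df2$com <- apply(df2[,1:2],1,function(x) if(x[1] != x[2]) paste(sort(x),collapse='-'))
# Drop the NULL entries, then sum the counts per unordered pair
df2 <- df2[df2$com != "NULL",]
ddply(df2, .(unlist(com)), summarise, 
      num = sum(Freq))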
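The ddply() call comes from the plyr package. If you would rather stay in base R, the same roll-up can be sketched with aggregate(). This is a swapped-in technique, not what the answer used, and the names off_diag and result are made up for illustration.

```r
df <- data.frame(x = c("a", "b", "c", "c", "c"),
                 y = c("c", "c", "a", "a", "b"),
                 stringsAsFactors = FALSE)

# Ordered frequency table as a data.frame with columns x, y, Freq
resdf <- as.data.frame(table(df), stringsAsFactors = FALSE)

# Keep only pairs of two different events
off_diag <- resdf[resdf$x != resdf$y, ]

# Same sorted "a-c" style label as in the answers above
off_diag$com <- apply(off_diag[, c("x", "y")], 1,
                      function(p) paste(sort(p), collapse = "-"))

# Sum the ordered counts within each unordered pair
result <- aggregate(Freq ~ com, data = off_diag, FUN = sum)
result
```

Filtering on x != y before labelling avoids the NULL entries, so there is no list column to unlist.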