Count the number of times a combination of events appears in data frame columns

Submitted by 烂漫一生 on 2019-11-29 13:08:45

As said, you can do this with factor() and expand.grid() (or any other way of generating all possible combinations):

all.possible <- expand.grid(c('a','b','c'), c('a','b','c'))
all.possible <- all.possible[all.possible[, 1] != all.possible[, 2], ]
all.possible <- unique(apply(all.possible, 1, function(x) paste(sort(x), collapse='-')))

df <- data.frame('x' = c('a', 'b', 'c', 'c', 'c'), 
                 'y' = c('c', 'c', 'a', 'a', 'b'))
table(factor(apply(df , 1, function(x) paste(sort(x), collapse='-')), levels=all.possible))
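
As one "other way" of building the list of possible combinations, here is a minimal sketch using combn() from base R, which enumerates each unordered pair exactly once. The names events, pairs and counts are only illustrative and do not come from the answer above.

```r
# Assumed set of possible events (same three letters as in the example)
events <- c('a', 'b', 'c')

# combn() yields every unordered pair once, so there is no need to
# filter out self-pairs or de-duplicate; sorting first keeps the labels
# consistent with the sort() applied to each row below
all.possible <- combn(sort(events), 2, paste, collapse = '-')

df <- data.frame(x = c('a', 'b', 'c', 'c', 'c'),
                 y = c('c', 'c', 'a', 'a', 'b'))

# Order-independent label for each row, then count against all levels
pairs <- apply(df, 1, function(x) paste(sort(x), collapse = '-'))
counts <- table(factor(pairs, levels = all.possible))
counts
```

Because the labels are turned into a factor with explicit levels, pairs that never occur in df (here a-b) still show up with a count of 0.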

This should do it:

res = table(df)

To convert to data frame:

resdf = as.data.frame(res)

The resdf data.frame looks like:

  x y Freq
1 a a    0
2 b a    0
3 c a    2
4 a b    0
5 b b    0
6 c b    1
7 a c    1
8 b c    1
9 c c    0

Note that this answer takes order into account. If the order of the columns is unimportant, sort each row of the original data.frame before tabulating, so that a-c is treated the same as c-a:

df1 = as.data.frame(t(apply(df,1,sort)))
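
To make the effect of the row-wise sort concrete, the sketch below tabulates the sorted data frame. The column names first and second and the name resdf1 are assumptions added for illustration; the answer itself leaves the columns with whatever names as.data.frame() assigns.

```r
df <- data.frame(x = c('a', 'b', 'c', 'c', 'c'),
                 y = c('c', 'c', 'a', 'a', 'b'))

# Sort each row so that (c, a) and (a, c) become the same pair
df1 <- as.data.frame(t(apply(df, 1, sort)))
names(df1) <- c('first', 'second')  # hypothetical names, set explicitly

# Tabulate the sorted pairs and flatten to a data frame
resdf1 <- as.data.frame(table(df1))
resdf1
```

With this approach only values that actually occur in each sorted column become levels, so a pair such as a-b that never appears in the data is not listed with a zero count.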

An alternative, because I was a bit bored. Perhaps a bit more generalised? But probably still uglier than it could be...

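library(plyr)  # provides ddply() and .()

df2 <- as.data.frame(table(df))
# drop the rows where both events are the same (a-a, b-b, c-c)
df2 <- df2[as.character(df2$x) != as.character(df2$y), ]
# order-independent label for each remaining pair
df2$com <- apply(df2[, 1:2], 1, function(x) paste(sort(x), collapse = '-'))
ddply(df2, .(com), summarise,
      num = sum(Freq))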
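
If installing plyr is not an option, the same grouping step can probably be done with aggregate() from base R. This is a sketch of a swapped-in technique, not the answerer's method; the result name res is an assumption.

```r
df <- data.frame(x = c('a', 'b', 'c', 'c', 'c'),
                 y = c('c', 'c', 'a', 'a', 'b'))

# Ordered counts for every (x, y) combination
df2 <- as.data.frame(table(df))

# Drop self-pairs, then build an order-independent label
df2 <- df2[as.character(df2$x) != as.character(df2$y), ]
df2$com <- apply(df2[, 1:2], 1, function(x) paste(sort(x), collapse = '-'))

# Sum the ordered counts within each unordered pair (base R only)
res <- aggregate(Freq ~ com, data = df2, FUN = sum)
res
```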