Generate All ID Pairs, by group with data.table in R

China☆狼群 submitted on 2019-12-13 07:26:21

Question


I have a data.table with many individuals (with ids) in many groups. Within each group, I would like to find every combination of ids (every pair of individuals). I know how to do this with a split-apply-combine approach, but I am hoping that a data.table would be faster.

Sample data:

library(data.table)
dat <- data.table(ids=1:20, groups=sample(x=c("A","B","C"), 20, replace=TRUE))

Split-Apply-Combine Method:

datS <- split(dat, f=dat$groups)

datSc <- lapply(datS, function(x){ as.data.table(t(combn(x$ids, 2)))})

rbindlist(datSc)

head(rbindlist(datSc))
   V1 V2
1:  2  5
2:  2 10
3:  2 19
4:  5 10
5:  5 19
6: 10 19

My best data.table attempt produces a single column, not two columns with all the possible combinations:

dat[, combn(x=ids, m=2), by=groups]

Thanks in advance.


Answer 1:


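combn() returns a matrix, and a matrix returned in j is treated as a plain vector, which is why your attempt gives a single column. Transpose it so that each row is one pair, then convert it to a data.table or data.frame so that j returns one column per pair member. This should work: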

library(data.table)  
set.seed(10)
dat <- data.table(ids=1:20, groups=sample(x=c("A","B","C"), 20, replace=TRUE))
dt <- dat[, as.data.table(t(combn(ids, 2))), by = groups]
head(dt)
   groups V1 V2
1:      C  1  3
2:      C  1  5
3:      C  1  7
4:      C  1 10
5:      C  1 13
6:      C  1 14
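
One caveat with the approach above: when combn() is given a single positive integer n, it treats it as seq_len(n), so a group with only one id would yield pairs of 1..n rather than no pairs. The following is a minimal sketch of one way to guard against that. The helper name pairs_of and the small example data are hypothetical and are not part of the original answer.

```r
library(data.table)

# Hypothetical example data: group "B" has a single member (id 7).
dat <- data.table(ids = c(1L, 2L, 3L, 7L),
                  groups = c("A", "A", "A", "B"))

# Hypothetical helper: all pairs of x as a two-column data.table,
# or zero rows when x has fewer than two elements.
pairs_of <- function(x) {
  if (length(x) < 2L) {
    return(data.table(V1 = x[0], V2 = x[0]))
  }
  as.data.table(t(combn(x, 2)))
}

# Groups that return zero rows simply contribute nothing to the result.
dt <- dat[, pairs_of(ids), by = groups]
print(dt)
```

Without the guard, group "B" would call combn(7L, 2) and contribute 21 spurious pairs.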



Answer 2:


library(data.table)
dat <- data.table(ids=1:20, groups=sample(x=c("A","B","C"), 20, replace=TRUE))
ind <- unique(dat$groups)
# All id pairs within each group
lapply(ind, function(g) combn(dat$ids[dat$groups == g], 2))

The result is a list with one two-row matrix per group, where each column is one pair. You can then convert the list to whatever format you need.
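
As a sketch of one such conversion, the list of matrices can be stacked into a single data.table with a group column using rbindlist() and its idcol argument. The deterministic group assignment via rep() is an assumption made here so that the result is reproducible; it replaces the random sample() call from the answer.

```r
library(data.table)

# Assumed deterministic data (the answer uses sample() instead):
# ids 1..20 are assigned to groups A, B, C in rotation.
dat <- data.table(ids = 1:20,
                  groups = rep(c("A", "B", "C"), length.out = 20))

ind <- unique(dat$groups)

# One 2 x n matrix of pairs per group, as in the answer.
pairs <- lapply(ind, function(g) combn(dat$ids[dat$groups == g], 2))
names(pairs) <- ind  # name each element so idcol can label the rows

# Transpose each matrix so rows are pairs, then stack with a group column.
out <- rbindlist(lapply(pairs, function(m) as.data.table(t(m))),
                 idcol = "groups")
head(out)
```

Naming the list elements is what lets idcol fill the groups column with the group labels instead of integer positions.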



Source: https://stackoverflow.com/questions/37333996/generate-all-id-pairs-by-group-with-data-table-in-r
