Reshaping data (a faster way)

Posted by 旧街凉风 on 2019-12-05 09:32:27

Here is a one-liner using ddply from the plyr package.

dat2 <- ddply(dat, 1:4, summarize, sex = c(rep('m', m), rep('f', f)))

And here's a base R one-liner.

dat2 <- cbind(dat[c(rep(1:nrow(dat), dat$m), rep(1:nrow(dat), dat$f)),1:4],
              sex=c(rep("m",sum(dat$m)), rep("f", sum(dat$f))))
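To make the indexing trick easier to follow, here is a small sketch of the same base R approach on toy data. The data frame below is hypothetical; it only assumes the layout the answers imply, with four identifier columns (i1 to i4) followed by the counts m and f.

```r
# Hypothetical toy data: i1..i4 identify a cell, m and f are head counts.
dat <- data.frame(i1 = c(1, 1, 2), i2 = c(1, 2, 1),
                  i3 = c(1, 1, 1), i4 = c(1, 1, 1),
                  m  = c(2, 0, 1), f  = c(1, 3, 0))

# Repeat each row index once per male, then once per female.
idx <- c(rep(seq_len(nrow(dat)), dat$m),
         rep(seq_len(nrow(dat)), dat$f))
idx  # 1 1 3 1 2 2 2

# Subset with the repeated indices and attach the matching labels.
dat2 <- cbind(dat[idx, 1:4],
              sex = c(rep("m", sum(dat$m)), rep("f", sum(dat$f))))
rownames(dat2) <- NULL  # drop the duplicated-row labels such as "1.1"

nrow(dat2) == sum(dat$m) + sum(dat$f)  # TRUE
```

Rows with a zero count simply contribute no indices, so they need no special handling.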

Or, a little more generally:

d1 <- dat[,1:4]
d2 <- as.matrix(dat[,5:6])
dat2 <- cbind(d1[rep(rep(1:nrow(dat), ncol(d2)), d2),], 
              sex=rep(colnames(d2), colSums(d2)))
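The general version could be wrapped in a small helper, roughly as sketched below. The function name expand_counts and its arguments are made up for illustration, and the toy data is hypothetical; the sketch assumes all counts are non-negative whole numbers.

```r
# Hypothetical helper: one output row per counted unit.
expand_counts <- function(dat, id_cols, count_cols, name = "sex") {
  counts <- as.matrix(dat[, count_cols, drop = FALSE])
  # Assumption: counts are non-negative whole numbers with no NAs.
  stopifnot(all(counts >= 0), all(counts == round(counts)))
  # Row indices once per count column, each repeated by its count
  # (the matrix is read column by column).
  idx <- rep(rep(seq_len(nrow(dat)), ncol(counts)), as.vector(counts))
  out <- dat[idx, id_cols, drop = FALSE]
  # Label each block of rows with the count column it came from.
  out[[name]] <- rep(colnames(counts), colSums(counts))
  rownames(out) <- NULL
  out
}

# Hypothetical toy data with the same layout as above.
dat <- data.frame(i1 = c(1, 1, 2), i2 = c(1, 2, 1),
                  i3 = c(1, 1, 1), i4 = c(1, 1, 1),
                  m  = c(2, 0, 1), f  = c(1, 3, 0))
out <- expand_counts(dat, id_cols = 1:4, count_cols = c("m", "f"))
table(out$sex)  # should reproduce the column totals of m and f
```

Because the labels come from the column names, the same call should work unchanged with more than two count columns.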

Since nobody has posted a data.table solution (as suggested in the original question), here is one:

library(data.table)
DT <- as.data.table(dat)
DT[, list(sex = rep(c('m', 'f'), c(m, f))), by = list(i1, i2, i3, i4)]

Or, even more succinctly:

DT[, list(sex = rep(c('m', 'f'), c(m, f))), by = 'i1,i2,i3,i4']

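I would use melt (from reshape2) for the first step, reshaping the counts into long format, and ddply (from plyr) for the second, expanding each count into that many rows.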

library(reshape2)
library(plyr)
d <- ddply( 
  melt(dat, id.vars=c("i1","i2","i3","i4"), variable.name="sex"), 
  c("i1","i2","i3","i4","sex"), 
  summarize, 
  id=rep(1,value) 
)
d$id <- cumsum(d$id)
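For readers who want to see the intermediate long table, the same two steps can be sketched in base R. This is an illustration on hypothetical toy data, not the melt/ddply code above, and the row order of the result may differ from what ddply returns.

```r
# Hypothetical toy data: i1..i4 identify a cell, m and f are counts.
dat <- data.frame(i1 = c(1, 1, 2), i2 = c(1, 2, 1),
                  i3 = c(1, 1, 1), i4 = c(1, 1, 1),
                  m  = c(2, 0, 1), f  = c(1, 3, 0))

# Step 1: long format, one row per (cell, sex) with its count in `value`.
long <- data.frame(dat[rep(seq_len(nrow(dat)), 2), 1:4],
                   sex   = rep(c("m", "f"), each = nrow(dat)),
                   value = c(dat$m, dat$f))

# Step 2: repeat each long row `value` times; zero counts drop out.
d <- long[rep(seq_len(nrow(long)), long$value),
          c("i1", "i2", "i3", "i4", "sex")]
rownames(d) <- NULL
d$id <- seq_len(nrow(d))  # running id, like cumsum() over a column of 1s
```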