How to expand a large dataframe in R

风流意气都作罢 提交于 2019-11-28 12:58:55

expand.grid is a useful function here,

mergedData <- merge(
    expand.grid(id = unique(df$id), spp = unique(df$spp)),
    df, by = c("id", "spp"), all =T)

mergedData[is.na(mergedData$y), ]$y <- 0

mergedData$date <- rep(levels(df$date),
                       each = length(levels(df$spp)))

Since you're not actually doing anything to subsets of the data I don't think plyr will help, maybe more efficient ways with data.table.

I would go the second way, hope this helps

x<-unique(df$id)
y<-unique(df$spp)
newdf<-data.frame(x=rep(x,each=length(y)),y=rep(y, length(x)))
merged<-merge(newdf, df, by.x=c(x,y), by.y=c("id","spp"), all=T)

There is a new function complete in the development version of tidyr that does this. Of course complete uses expand.grid internally.

# get new version of tidyr
devtools::install_github("hadley/tidyr")
# load package
require(tidyr)
# calculations
complete(df, c(id, date), spp, fill = list(y = 0))
##    id       date spp y
## 1   1 1985-06-19   a 5
## 2   1 1985-06-19   b 3
## 3   1 1985-06-19   c 5
## 4   1 1985-06-19   d 0
## 5   2 1985-08-01   a 0
## 6   2 1985-08-01   b 0
## 7   2 1985-08-01   c 4
## 8   2 1985-08-01   d 9
## 9   3 1990-06-19   a 8
## 10  3 1990-06-19   b 3
## 11  3 1990-06-19   c 5
## 12  3 1990-06-19   d 6
## 13  4 2000-05-12   a 0
## 14  4 2000-05-12   b 3
## 15  4 2000-05-12   c 0
## 16  4 2000-05-12   d 0
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!