rbinding a list of lists of dataframes based on nested order

问题

I have a dataframe, df and a function process that returns a list of two dataframes, a and b. I use dlply to split up the df on an id column, and then return a list of lists of dataframes. Here's sample data/code that approximates the actual data and methods:

df <- data.frame(id1=rep(c(1,2,3,4), each=2))

process <- function(df) {
  a <- data.frame(d1=rnorm(1), d2=rnorm(1))
  b <- data.frame(id1=df$id1, a=rnorm(nrow(df)), b=runif(nrow(df)))
  list(a=a, b=b)
}

require(plyr)
output <- dlply(df, .(id1), process)

output is a list of lists of dataframes, the nested list will always have two dataframes, named a and b. In this case the outer list has a length 4.

What I am looking to generate is a dataframe with all the a dataframes, along with an id column indicating their respective value (I believe this is left in the list as the split_labels attribute, see str(output)). Then similarly for the b dataframes.

So far I have in part used this question to come up with this code:

list <- unlist(output, recursive = FALSE)
list.a <- lapply(1:4, function(x) {
  list[[(2*x)-1]]
})
all.a <- rbind.fill(list.a)

Which gives me the final a dataframe (and likewise for b with a different subscript into list), however it doesn't have the id column I need and I'm pretty sure there's got to be a more straightforward or elegant solution. Ideally something clean using plyr.

回答1:

Not very clean but you can try something like this (assuming the same data generation process).

list.aID <- lapply(1:4, function(x) {
cbind(list[[(2*x) - 1]], list[[2*x]][1, 1, drop = FALSE])
})

all.aID <- rbind.fill(list.aID)
all.aID

all.aID
        d1       d2 id1
1  0.68103 -0.74023   1
2 -0.50684  1.23713   2
3  0.33795 -0.37277   3
4  0.37827  0.56892   4

来源：https://stackoverflow.com/questions/11934774/rbinding-a-list-of-lists-of-dataframes-based-on-nested-order

标签

list

dataframe

plyr