I have code that at one place ends up with a list of data frames which I really want to convert to a single big data frame.
I got some pointers from an earlier ques
One other option is to use a plyr function:
df <- ldply(listOfDataFrames, data.frame)
This is a little slower than the original:
> system.time({ df <- do.call("rbind", listOfDataFrames) })
user system elapsed
0.25 0.00 0.25
> system.time({ df2 <- ldply(listOfDataFrames, data.frame) })
user system elapsed
0.30 0.00 0.29
> identical(df, df2)
[1] TRUE
My guess is that using do.call("rbind", ...)
is going to be the fastest approach that you will find unless you can do something like (a) use a matrices instead of a data.frames and (b) preallocate the final matrix and assign to it rather than growing it.
Edit 1:
Based on Hadley's comment, here's the latest version of rbind.fill
from CRAN:
> system.time({ df3 <- rbind.fill(listOfDataFrames) })
user system elapsed
0.24 0.00 0.23
> identical(df, df3)
[1] TRUE
This is easier than rbind, and marginally faster (these timings hold up over multiple runs). And as far as I understand it, the version of plyr on github is even faster than this.