I have several data frames in panel data form, and I want to merge them into a single panel data frame. The data frames have some things in common and some differences between them. I i
Two alternative possibilities, of which the data.table alternative(s) are especially of interest when speed and memory are an issue:
base R:
Bind the dataframes together into one:
df3 <- rbind(df1,df2)
Create a reference dataframe with all possible combinations of Month and variable with expand.grid:
ref <- expand.grid(Month = unique(df3$Month), variable = unique(df3$variable))
Merge them together with all.x=TRUE so you make sure the missing combinations are filled with NA-values:
merge(ref, df3, by = c("Month", "variable"), all.x = TRUE)
Or, specifying the first two columns by position instead of by name (thanks to @PierreLafortune):
merge(ref, df3, by=1:2, all.x = TRUE)
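To make the three base R steps concrete, here is a minimal sketch on made-up data. The contents of df1 and df2, and the value column, are assumptions for illustration only; the question does not show the real data.

```r
# Hypothetical long-format panels; Month, variable and value are assumed
# column names chosen to match the code above.
df1 <- data.frame(Month = c(1, 1, 2),
                  variable = c("a", "b", "a"),
                  value = c(10, 20, 30),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Month = c(2, 3),
                  variable = c("c", "a"),
                  value = c(40, 50),
                  stringsAsFactors = FALSE)

# Step 1: stack the panels (rbind needs identical column names).
df3 <- rbind(df1, df2)

# Step 2: build every Month x variable combination.
ref <- expand.grid(Month = unique(df3$Month),
                   variable = unique(df3$variable),
                   stringsAsFactors = FALSE)

# Step 3: left join onto the grid; absent combinations get NA in value.
panel <- merge(ref, df3, by = c("Month", "variable"), all.x = TRUE)

nrow(panel)              # 9: 3 months x 3 variables
sum(is.na(panel$value))  # 4: combinations present in neither input
```

Note that rbind only works here because both inputs have the same columns; if the real data frames differ in their columns, they would need to be reshaped to a common long format first.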
data.table:
Bind the dataframes into one with 'rbindlist' which returns a 'data.table':
library(data.table)
DT <- rbindlist(list(df1,df2))
Join with a reference to ensure all combinations are present and missing ones are filled with NA:
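# the CJ() columns are named explicitly so that 'on' does not depend on auto-generated names
DT[CJ(Month = Month, variable = variable, unique = TRUE), on = c("Month", "variable")]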
Everything together in one call:
DT <- rbindlist(list(df1,df2))[CJ(Month = Month, variable = variable, unique = TRUE), on = c("Month", "variable")]
An alternative is wrapping rbindlist in setkey and then expanding with CJ (cross join):
DT <- setkey(rbindlist(list(df1,df2)), Month, variable)[CJ(Month, variable, unique = TRUE)]
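The data.table route can be sketched end to end on made-up data as well. As before, the contents of df1 and df2 and the value column are assumptions, not the asker's data. The cross join is stored in a separate object here (grid, a name introduced only for this example) so that each step can be inspected.

```r
library(data.table)

# Hypothetical long-format panels (assumed columns: Month, variable, value).
df1 <- data.frame(Month = c(1, 1, 2),
                  variable = c("a", "b", "a"),
                  value = c(10, 20, 30),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Month = c(2, 3),
                  variable = c("c", "a"),
                  value = c(40, 50),
                  stringsAsFactors = FALSE)

# Stack the inputs; rbindlist returns a data.table.
DT <- rbindlist(list(df1, df2))

# All Month x variable combinations, with explicitly named columns.
grid <- DT[, CJ(Month = Month, variable = variable, unique = TRUE)]

# Join DT onto the grid: one row per combination, NA where DT has no match.
panel <- DT[grid, on = c("Month", "variable")]

nrow(panel)              # 9: 3 months x 3 variables
sum(is.na(panel$value))  # 4: combinations missing from both inputs
```

If a Month/variable pair occurs more than once in the stacked data, the join returns one row per occurrence, so the result can have more rows than the grid.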