Question
I have a list of 9 data frames, each with approximately 100 rows and 5-6 columns.
I want to aggregate the values in one column by the groups specified in another column, across all data frames, and store all results in a separate data frame. To illustrate, consider the list:
[[1]]
Date Group Age
Nov A 13
Nov A 14
Nov B 9
Nov D 10
[[2]]
Date Group Age
Dec C 11
Dec C 12
Dec E 10
My code is as follows:
for (i in 1:length(list)){
x<-aggregate(list[[i]]$Age~list[[i]]$Group, list[[i]], sum)
x<-rbind(x)
}
But in the end, x contains only the aggregate result from data frame 2 (since i = 2) and not that of data frame 1, even though I am trying to bind the results.
Any help is much appreciated.
Answer 1:
In R, there are many efficiently implemented functions which help to avoid the hassle of writing for loops.
In his comment, S Rivero suggested using lapply() instead of a for loop and combining the aggregates with rbind() afterwards:
do.call(rbind, lapply(dflist, function(x) aggregate(Age ~ Group, x, sum)))
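For comparison, the loop in the question fails because x is overwritten on every iteration, and rbind(x) called with a single argument simply returns x unchanged. A minimal sketch of a working loop is shown below; it assumes the sample data from the question, rebuilt here with data.frame() so the snippet runs on its own, and the name results is only illustrative.

```r
# Sample data matching the question (rebuilt here so the snippet is self-contained)
dflist <- list(
  data.frame(Date = "Nov", Group = c("A", "A", "B", "D"), Age = c(13L, 14L, 9L, 10L)),
  data.frame(Date = "Dec", Group = c("C", "C", "E"), Age = c(11L, 12L, 10L))
)

# Store one aggregate per data frame instead of overwriting a single variable
results <- vector("list", length(dflist))
for (i in seq_along(dflist)) {
  results[[i]] <- aggregate(Age ~ Group, data = dflist[[i]], FUN = sum)
}

# Bind all per-data-frame aggregates once, after the loop
x <- do.call(rbind, results)
print(x)
```

Note that this, like the lapply() version, aggregates within each data frame separately, so a group that occurs in more than one data frame would appear in more than one row of the result.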
My suggestion is to combine the data.frames first and then compute the aggregates using data.table:
library(data.table)
rbindlist(dflist)[, sum(Age), by = Group]
   Group V1
1:     A 27
2:     B  9
3:     D 10
4:     C 23
5:     E 10
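If data.table is not available, the same combine-first idea can be sketched in base R. This is only an illustration under the assumption that all data frames share the same column names; the variable names combined and res are made up for the example, and the sample data is rebuilt with data.frame() so the snippet runs on its own.

```r
# Sample data matching the question (rebuilt here so the snippet is self-contained)
dflist <- list(
  data.frame(Date = "Nov", Group = c("A", "A", "B", "D"), Age = c(13L, 14L, 9L, 10L)),
  data.frame(Date = "Dec", Group = c("C", "C", "E"), Age = c(11L, 12L, 10L))
)

# Stack all data frames into one, then aggregate once over the combined data
combined <- do.call(rbind, dflist)
res <- aggregate(Age ~ Group, data = combined, FUN = sum)
print(res)
```

Because the aggregation runs on the combined data, a group that occurs in several data frames is summed into a single row. Unlike the data.table result, which lists groups in order of first appearance, aggregate() returns the groups in sorted order.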
Data
dflist <- list(structure(list(Date = c("Nov", "Nov", "Nov", "Nov"), Group = c("A",
"A", "B", "D"), Age = c(13L, 14L, 9L, 10L)), .Names = c("Date",
"Group", "Age"), row.names = c(NA, -4L), class = "data.frame"),
structure(list(Date = c("Dec", "Dec", "Dec"), Group = c("C",
"C", "E"), Age = c(11L, 12L, 10L)), .Names = c("Date", "Group",
"Age"), row.names = c(NA, -3L), class = "data.frame"))
Source: https://stackoverflow.com/questions/45513776/aggregating-across-list-of-dataframes-and-storing-all-results