Aggregating across list of dataframes and storing all results

Posted by 戏子无情 on 2019-12-19 11:58:10

Question


I have a list of 9 data frames, each with approximately 100 rows and 5-6 columns.

I want to aggregate the values in one column by the groups given in another column, for every data frame in the list, and store all the results in a single separate data frame. To illustrate, consider this list:

    [[1]]  
    Date  Group  Age
    Nov     A    13
    Nov     A    14
    Nov     B    9
    Nov     D    10
    [[2]]
    Date  Group  Age
    Dec     C    11
    Dec     C    12
    Dec     E    10

My code is as follows

for (i in 1:length(list)){
x<-aggregate(list[[i]]$Age~list[[i]]$Group, list[[i]], sum)
x<-rbind(x)
}

However, after the loop finishes, x contains only the aggregate from data frame 2 (the last iteration, i = 2) and not the one from data frame 1, even though I am trying to bind the results.

Any help is much appreciated.


Answer 1:


R has many efficiently implemented functions that spare you the hassle of writing for loops.

In a comment, S Rivero suggested using lapply() instead of a for loop and combining the aggregates with rbind() afterwards:

do.call(rbind, lapply(dflist, function(x) aggregate(Age ~ Group, x, sum)))
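For comparison, the loop in the question keeps only the last result because x is overwritten on every iteration, and rbind(x) called with a single argument simply returns x. If a for loop is preferred anyway, a minimal sketch of a corrected version might look like this. The names result and agg are illustrative, and dflist is rebuilt here from the question's sample values so the snippet runs on its own:

```r
# Sample data matching the two data frames shown in the question
dflist <- list(
  data.frame(Date = "Nov", Group = c("A", "A", "B", "D"),
             Age = c(13L, 14L, 9L, 10L)),
  data.frame(Date = "Dec", Group = c("C", "C", "E"),
             Age = c(11L, 12L, 10L))
)

result <- NULL                      # accumulator (hypothetical name)
for (i in seq_along(dflist)) {
  # Refer to columns by name and pass the data frame via `data`
  agg <- aggregate(Age ~ Group, data = dflist[[i]], FUN = sum)
  # Append to the accumulated result instead of overwriting it
  result <- rbind(result, agg)
}
result
```

Growing a data frame inside a loop copies it on each iteration, which is harmless for 9 small data frames but is one reason the lapply() form above is usually preferred.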

My suggestion is to combine the data.frames first and then compute the aggregates using data.table:

library(data.table)
rbindlist(dflist)[, sum(Age), by = Group]
   Group V1
1:     A 27
2:     B  9
3:     D 10
4:     C 23
5:     E 10
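The same "combine first, then aggregate" idea can also be sketched in base R, without data.table. This is an alternative illustration rather than the code from the answer; the names combined and totals are hypothetical, and dflist is rebuilt from the question's sample values:

```r
# Sample data matching the two data frames shown in the question
dflist <- list(
  data.frame(Date = "Nov", Group = c("A", "A", "B", "D"),
             Age = c(13L, 14L, 9L, 10L)),
  data.frame(Date = "Dec", Group = c("C", "C", "E"),
             Age = c(11L, 12L, 10L))
)

# Stack all data frames into one, then aggregate once over the whole set
combined <- do.call(rbind, dflist)
totals <- aggregate(Age ~ Group, data = combined, FUN = sum)
totals
```

One design difference is worth keeping in mind: combining first yields a single row per group across the whole list, whereas aggregating each data frame separately and binding afterwards yields one row per group per data frame. With the sample data the totals agree because no group appears in both data frames.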

Data

dflist <- list(structure(list(Date = c("Nov", "Nov", "Nov", "Nov"), Group = c("A", 
"A", "B", "D"), Age = c(13L, 14L, 9L, 10L)), .Names = c("Date", 
"Group", "Age"), row.names = c(NA, -4L), class = "data.frame"), 
    structure(list(Date = c("Dec", "Dec", "Dec"), Group = c("C", 
    "C", "E"), Age = c(11L, 12L, 10L)), .Names = c("Date", "Group", 
    "Age"), row.names = c(NA, -3L), class = "data.frame"))


Source: https://stackoverflow.com/questions/45513776/aggregating-across-list-of-dataframes-and-storing-all-results
