Dplyr summarise_each to aggregate results

Submitted by 柔情痞子 on 2019-11-30 15:10:14

Try recast from the reshape2 package:

library(reshape2)
recast(DF, variable ~ field1 + field2, sum)
#   variable     L_S1      L_S2       L_S4       S_S1       S_S2      S_S3
# 1  metric1 1.078097  1.736221  0.4187475 -0.2708038  0.3283072  2.033338
# 2  metric2 4.256988  6.784695 17.9023881  2.4908063 -0.8374830  4.047061
# 3  metric3 7.171010 11.311641  4.8122051 26.0352489  1.8973718 13.248310

which is shorthand for melting and then casting:

dcast(melt(DF, c("field1", "field2")), variable ~ field1 + field2, sum)

You can also combine dcast with tidyr::gather if you prefer, but you can't use tidyr::spread for the last step, because it has no fun.aggregate argument:

DF %>%
  gather(variable, value, -(field1:field2)) %>%
  dcast(variable ~ field1 + field2, sum)
#   variable     L_S1      L_S2       L_S4       S_S1       S_S2      S_S3
# 1  metric1 1.078097  1.736221  0.4187475 -0.2708038  0.3283072  2.033338
# 2  metric2 4.256988  6.784695 17.9023881  2.4908063 -0.8374830  4.047061
# 3  metric3 7.171010 11.311641  4.8122051 26.0352489  1.8973718 13.248310
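In newer tidyr releases (1.1.0 and later, as far as I know), pivot_wider takes a values_fn argument that plays the role of fun.aggregate, so the whole reshape can stay inside tidyr. A minimal sketch, assuming a made-up DF because the original data isn't shown here; only the column names field1, field2, metric1, metric2 follow the question:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data; the values are invented for illustration only.
DF <- data.frame(
  field1  = c("L", "L", "S", "S"),
  field2  = c("S1", "S1", "S1", "S2"),
  metric1 = c(1, 2, 3, 4),
  metric2 = c(10, 20, 30, 40)
)

res <- DF %>%
  # Long format: one row per (field1, field2, variable) observation
  pivot_longer(-c(field1, field2), names_to = "variable", values_to = "value") %>%
  # Wide format: one column per field1_field2 pair, duplicates summed
  pivot_wider(
    names_from  = c(field1, field2),
    values_from = value,
    values_fn   = sum
  )

print(res)
```

With this toy data the two L/S1 rows collapse into one cell, so L_S1 for metric1 is 1 + 2 = 3.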

For an all dplyr and tidyr solution, you could do:

library(dplyr)
library(tidyr)

df %>% 
  unite(variable, field1, field2) %>% 
  group_by(variable) %>% 
  summarise_each(funs(sum)) %>% 
  gather(metrics, value, -variable) %>%
  spread(variable, value)

Which gives:

#Source: local data frame [3 x 7]
#
#  metrics     L_S1      L_S2       L_S4       S_S1       S_S2      S_S3
#1 metric1 1.078097  1.736221  0.4187475 -0.2708038  0.3283072  2.033338
#2 metric2 4.256988  6.784695 17.9023881  2.4908063 -0.8374830  4.047061
#3 metric3 7.171010 11.311641  4.8122051 26.0352489  1.8973718 13.248310
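Note that summarise_each and funs were deprecated in later dplyr releases. If you are on dplyr 1.0.0 or newer, the same pipeline can probably be written with across, plus pivot_longer and pivot_wider in place of gather and spread. A sketch under those assumptions, with an invented df standing in for the real data:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data; the values are invented for illustration only.
df <- data.frame(
  field1  = c("L", "L", "S", "S"),
  field2  = c("S1", "S1", "S1", "S2"),
  metric1 = c(1, 2, 3, 4),
  metric2 = c(10, 20, 30, 40)
)

res <- df %>%
  # Combine the two grouping fields into one key such as "L_S1"
  unite(variable, field1, field2) %>%
  group_by(variable) %>%
  # Sum every non-grouping column within each key
  summarise(across(everything(), sum)) %>%
  # Transpose: metrics become rows, keys become columns
  pivot_longer(-variable, names_to = "metrics", values_to = "value") %>%
  pivot_wider(names_from = variable, values_from = value)

print(res)
```

Here the aggregation happens in summarise, so each key/metric pair is already unique by the time the data is widened and no values_fn is needed.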

Edit

After reading your comment on David's answer, I think this is closer to your expected output:

field1 <- group_by(df, field = field1) %>% summarise_each(funs(sum), -(field1:field2)) 
field2 <- group_by(df, field = field2) %>% summarise_each(funs(sum), -(field1:field2)) 

bind_rows(field1, field2) %>%
  gather(metrics, value, -field) %>%
  spread(field, value)

Which gives:

#Source: local data frame [3 x 7]
#
#  metrics         L         S         S1        S2        S3         S4
#1 metric1  3.233065  2.090842  0.8072928  2.064528  2.033338  0.4187475
#2 metric2 28.944071  5.700384  6.7477945  5.947212  4.047061 17.9023881
#3 metric3 23.294855 41.180931 33.2062584 13.209013 13.248310  4.8122051