How to do Group By Rollup in R? (Like SQL)

陌清茗

I have a dataset and I want to perform something like Group By Rollup like we have in SQL for aggregate values.

Below is a reproducible example. I k

3 Answers

遇见更好的自我

2020-12-19 19:55

In the recent development version of data.table (1.10.5) you can use the new "grouping sets" feature to produce subtotals:

library(data.table)
setDT(df)
res = groupingsets(df, .(sales=sum(sales)), sets=list(c("year","month"), c("year","month","region")), by=c("year","month","region"))
setorder(res, na.last=TRUE)
res
#   year month region sales
#1: 2016     1   east   400
#2: 2016     1   west   600
#3: 2016     1     NA  1000
#4: 2017     2   east   800
#5: 2017     2   west  1200
#6: 2017     2     NA  2000

You can replace the NA in the region column with "USA" using res[is.na(region), region := "USA"].
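
As a closer analogue of SQL's GROUP BY ROLLUP, data.table also provides rollup() alongside groupingsets(). The sketch below is only illustrative: the question's own data is truncated, so dt is a hypothetical table chosen to match the totals shown above. Unlike the groupingsets() call, rollup() also returns per-year totals and a grand total.

```r
library(data.table)

# Hypothetical data (an assumption, not the question's original example),
# chosen so the region-level sums match the output shown above.
dt <- data.table(
  year   = c(2016, 2016, 2016, 2016, 2017, 2017, 2017, 2017),
  month  = c(1, 1, 1, 1, 2, 2, 2, 2),
  region = c("east", "east", "west", "west", "east", "east", "west", "west"),
  sales  = c(100, 300, 200, 400, 300, 500, 500, 700)
)

# rollup() aggregates at every prefix of `by`:
# (year, month, region), (year, month), (year), and the grand total.
res <- rollup(dt, j = .(sales = sum(sales)), by = c("year", "month", "region"))

# Put each subtotal row (NA in the rolled-up columns) after its detail rows.
setorder(res, year, month, region, na.last = TRUE)
res
```

Rows with NA in region, month, or year are the subtotals for the remaining columns. Passing id = TRUE adds a grouping column that identifies the aggregation level, which helps when the data itself contains NA.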

陌清茗

2020-12-19 19:57

melt/dcast in the reshape2 package can do subtotalling. After running dcast, we replace "(all)" in the month column with the actual month using na.locf from the zoo package:

library(reshape2)
library(zoo)

m <- melt(df, measure.vars = "sales")
dout <- dcast(m, year + month + region ~ variable, fun.aggregate = sum, margins = "month")

dout$month <- na.locf(replace(dout$month, dout$month  == "(all)", NA))

giving:

> dout
  year month region sales
1 2016     1   east   400
2 2016     1   west   600
3 2016     1  (all)  1000
4 2017     2   east   800
5 2017     2   west  1200
6 2017     2  (all)  2000

小鲜肉

2020-12-19 20:13

plyr::ddply(df, c("year", "month", "region"), plyr::summarise, sales = sum(sales))
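
Note that the ddply call above returns only the detail rows (one per year/month/region), without subtotals. A minimal base R sketch of adding the year/month subtotals follows. It uses aggregate and rbind rather than plyr, and df here is hypothetical data chosen to match the totals in the other answers, since the question's example is truncated.

```r
# Hypothetical data (an assumption, not the question's original example).
df <- data.frame(
  year   = c(2016, 2016, 2016, 2016, 2017, 2017, 2017, 2017),
  month  = c(1, 1, 1, 1, 2, 2, 2, 2),
  region = c("east", "east", "west", "west", "east", "east", "west", "west"),
  sales  = c(100, 300, 200, 400, 300, 500, 500, 700),
  stringsAsFactors = FALSE
)

# Detail level: one row per year/month/region.
detail <- aggregate(sales ~ year + month + region, data = df, FUN = sum)

# Subtotal level: one row per year/month, with region set to NA.
subtotal <- aggregate(sales ~ year + month, data = df, FUN = sum)
subtotal$region <- NA_character_

# Stack both levels, then sort so each subtotal follows its detail rows.
res <- rbind(detail, subtotal[, names(detail)])
res <- res[order(res$year, res$month, res$region, na.last = TRUE), ]
rownames(res) <- NULL
res
```

This yields the same six sales figures as the other answers, with NA marking the subtotal rows in the region column.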
