Add margin row totals in dplyr chain

一整个雨季 2020-12-05 00:25

I would like to add overall summary rows while also calculating summaries by group using dplyr. I have found various questions asking how to do this, e.g. here, here, and he

7 answers
  •  夕颜 (original poster)
    2020-12-05 01:22

    Here is my suggestion.

    1. Find every combination of the relevant grouping variables with the powerSet function (from the rje package).
    2. Build a list of data frames, one per combination, each grouped by that subset of the grouping variables, plus one ungrouped copy for the grand total.
    3. Summarise each data frame with an appropriate summary function (e.g. mean).
    4. Combine the results with bind_rows. Grouping variables that were not part of a given combination are now NA, because those columns are dropped in step 3.
    5. Replace the NA values in the grouping variables with appropriate labels.

    Note: numeric grouping variables are not dropped in step 3, because they are summarised like any other numeric column. I therefore mutate them to character variables first.
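To make steps 3 to 5 concrete, here is a minimal sketch with a single grouping variable. The data frame `toy` and its columns `grp` and `val` are made up for illustration and are not from the question; the label "all groups" is likewise an arbitrary choice.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, not from the original question.
toy <- tibble(
  grp = c("a", "a", "b"),
  val = c(1, 3, 10)
)

# Per-group means: the grouping column `grp` is kept in the result.
by_grp <- toy %>%
  group_by(grp) %>%
  summarise(val = mean(val))

# Overall mean: this result has no `grp` column at all.
overall <- toy %>%
  summarise(val = mean(val))

# bind_rows() fills the missing `grp` with NA; replace_na() then labels it.
result <- bind_rows(by_grp, overall) %>%
  replace_na(list(grp = "all groups"))
```

The last row of `result` is the margin row. With several grouping variables the same NA-then-label pattern applies to each variable that was left out of a given combination.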

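    library(dplyr)
    library(purrr)
    library(tidyr)

    # One copy of df per non-empty subset of the grouping variables,
    # preceded by an ungrouped copy that yields the grand total.
    powerSetList <- function(df, ...) {
      rje::powerSet(x = c(...))[-1] %>%
        lapply(function(x, tdf = df) group_by(tdf, across(all_of(x)))) %>%
        c(list(as_tibble(df)), .)
    }

    mtcars %>%
      # Convert to character so summarise_if() drops unused grouping columns
      mutate_at(vars("cyl", "gear"), as.character) %>%
      powerSetList("cyl", "gear") %>%
      map(~summarise_if(., is.numeric, .funs = mean)) %>%
      bind_rows() %>%
      replace_na(list(gear = "all gears",
                      cyl = "all cyls"))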
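If installing rje only for powerSet is undesirable, the non-empty subsets can probably be built with base R's `combn()` instead. The helper name `non_empty_subsets` below is hypothetical, and this is only a sketch that assumes the order of the subsets does not matter for the final table.

```r
# Hypothetical helper: all non-empty subsets of a character vector,
# built with base R's combn() instead of rje::powerSet().
non_empty_subsets <- function(vars) {
  unlist(
    # For each subset size k, combn() returns a list of size-k subsets.
    lapply(seq_along(vars), function(k) combn(vars, k, simplify = FALSE)),
    recursive = FALSE
  )
}

subsets <- non_empty_subsets(c("cyl", "gear"))
```

Each element of `subsets` is a character vector of column names, so it could stand in for `rje::powerSet(x = c(...))[-1]` inside powerSetList.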
    
