Group by multiple columns and sum other multiple columns

孤城傲影
孤城傲影 2020-11-22 07:35

I have a data frame with about 200 columns. I want to group the table by the first 10 or so, which are factors, and sum the rest of the columns.

I have list of

7 Answers
  •  误落风尘
    2020-11-22 08:29

    Let's consider this example:

    df <- data.frame(a = 'a', b = c('a', 'a', 'b', 'b', 'b'), c = 1:5, d = 11:15,
                     stringsAsFactors = TRUE)
    

    The _all, _at and _if scoped verbs are now superseded in favour of across(). To group by all the factor columns and sum all the other columns, we can do:

    library(dplyr)
    
    df %>% 
       group_by(across(where(is.factor))) %>% 
       summarise(across(everything(), sum))
    
    #  a     b         c     d
    #  <fct> <fct> <int> <int>
    #1 a     a         3    23
    #2 a     b        12    42
    

    To group by all factor columns and sum only the numeric columns:

    df %>% 
      group_by(across(where(is.factor))) %>% 
      summarise(across(where(is.numeric), sum))
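
The difference from the everything() version shows up once the data has a column that is neither a factor nor numeric. A minimal sketch, assuming a hypothetical character column named note (not part of the original example): sum() cannot be applied to it, so where(is.numeric) skips it and it is dropped from the result. The names df2 and res_numeric are made up for illustration.

```r
library(dplyr)

# Same toy data as above, plus a hypothetical character column `note`
# that is neither a grouping factor nor summable.
df2 <- data.frame(a = factor('a'),
                  b = factor(c('a', 'a', 'b', 'b', 'b')),
                  c = 1:5, d = 11:15,
                  note = c('v', 'w', 'x', 'y', 'z'),
                  stringsAsFactors = FALSE)

res_numeric <- df2 %>%
  group_by(across(where(is.factor))) %>%
  # Only numeric columns are summed; `note` is left out of the result.
  # .groups = "drop" returns an ungrouped tibble.
  summarise(across(where(is.numeric), sum), .groups = "drop")

print(res_numeric)
```

The result should contain only a, b, c and d, with the same sums as in the first example.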
    

    We can also do this by position, but we have to be careful with the indices: inside summarise(), across() does not count the grouping columns, so positions there refer only to the remaining columns.

    df %>% group_by(across(1:2)) %>% summarise(across(1:2, sum))
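
One way to sidestep the position-counting pitfall, for example with the question's "first 10 or so" grouping columns, is to turn positions into column names up front and select by name in both steps. This is only a sketch under assumptions: n_group, group_cols, sum_cols and res_by_name are hypothetical names, and n_group is set to 2 here to match the toy data.

```r
library(dplyr)

df <- data.frame(a = 'a', b = c('a', 'a', 'b', 'b', 'b'), c = 1:5, d = 11:15,
                 stringsAsFactors = TRUE)

n_group    <- 2                                # assumed number of leading grouping columns
group_cols <- names(df)[seq_len(n_group)]      # "a", "b"
sum_cols   <- setdiff(names(df), group_cols)   # "c", "d"

res_by_name <- df %>%
  group_by(across(all_of(group_cols))) %>%
  # Selecting by name means the result does not depend on how
  # positions are counted after grouping.
  summarise(across(all_of(sum_cols), sum), .groups = "drop")

print(res_by_name)
```

For a wide data frame like the one in the question, only n_group would need to change.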
    
