I have a data frame with about 200 columns. I want to group the table by the first 10 or so, which are factors, and sum the rest of the columns.
I have list of
Let's consider this example :
df <- data.frame(a = 'a', b = c('a', 'a', 'b', 'b', 'b'), c = 1:5, d = 11:15,
stringsAsFactors = TRUE)
The _all, _at and _if verbs are now superseded and we use across now. To group by all the factor columns and sum all the other columns, we can do :
library(dplyr)
df %>%
  group_by(across(where(is.factor))) %>%
  summarise(across(everything(), sum))
#  a     b         c     d
#  <fct> <fct> <int> <int>
#1 a     a         3    23
#2 a     b        12    42
To group by all factor columns and sum only the numeric columns :
df %>%
  group_by(across(where(is.factor))) %>%
  summarise(across(where(is.numeric), sum))
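The difference between everything() and where(is.numeric) only shows up when the data has columns that are neither factors nor numeric. Here is a minimal sketch, assuming a hypothetical variant of the example (df2, with an extra character column note that is not in the original data). Passing .groups = "drop" is my addition to return an ungrouped result.

```r
library(dplyr)

# Hypothetical variant of the example data: `note` is a character column,
# so it is neither a grouping factor nor something sum() can handle.
df2 <- data.frame(
  a = factor('a'),
  b = factor(c('a', 'a', 'b', 'b', 'b')),
  c = 1:5,
  d = 11:15,
  note = letters[1:5],
  stringsAsFactors = FALSE
)

# Group by the factor columns; only numeric columns are summed,
# so `note` is dropped from the result instead of causing an error.
res <- df2 %>%
  group_by(across(where(is.factor))) %>%
  summarise(across(where(is.numeric), sum), .groups = "drop")

res
```

With across(everything(), sum) the same call would try to sum note and fail, so where(is.numeric) is probably the safer choice for a wide table with mixed column types.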
We can also do this by position, but we have to be careful with the numbers: inside summarise, across does not count the grouping columns, so 1:2 below refers to c and d, not a and b.
df %>% group_by(across(1:2)) %>% summarise(across(1:2, sum))
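To check the positional version against the type-based one, here is a small sketch on the example data. The variable n_group is a hypothetical helper for the number of leading grouping columns (about 10 in the question), and .groups = "drop" is my addition.

```r
library(dplyr)

df <- data.frame(a = 'a', b = c('a', 'a', 'b', 'b', 'b'), c = 1:5, d = 11:15,
                 stringsAsFactors = TRUE)

# Hypothetical helper: how many leading columns to group by.
n_group <- 2

# Group by the first n_group columns by position. Inside summarise,
# everything() already excludes the grouping columns, so no index
# arithmetic is needed for "the rest".
by_pos <- df %>%
  group_by(across(all_of(seq_len(n_group)))) %>%
  summarise(across(everything(), sum), .groups = "drop")

# Type-based version for comparison.
by_type <- df %>%
  group_by(across(where(is.factor))) %>%
  summarise(across(where(is.numeric), sum), .groups = "drop")

all.equal(as.data.frame(by_pos), as.data.frame(by_type))
```

Using everything() for the summarised columns avoids having to work out which positions are left once the grouping columns are removed.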