When I use group_by and summarise in dplyr, I can naturally apply different summary functions to different variables. For instance:
library(tidyverse)
Here is one idea.
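library(tidyverse)

# Summarise each column with its own function, one data frame per function.
# funs() is deprecated since dplyr 0.8.0, so the function is passed directly.
df_mean <- df %>%
  group_by(category) %>%
  summarize_at(vars(x), mean)
df_median <- df %>%
  group_by(category) %>%
  summarize_at(vars(y), median)
df_first <- df %>%
  group_by(category) %>%
  summarize_at(vars(z), first)

# Join the per-function summaries back together by group
df_summary <- reduce(list(df_mean, df_median, df_first),
                     left_join, by = "category")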
As you said, there is no need to use summarise_at for this example. However, if you have many columns that need to be summarized by different functions, this strategy may work. You will need to specify the columns in vars(...) for each summarize_at call. The selection rules are the same as for dplyr::select.
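For what it's worth, on dplyr 1.0.0 or later the same per-column summaries can probably be written in a single call with across(), which supersedes the summarise_at family. This is a sketch and not part of the original answer; the toy df below is made up for illustration, since the post does not show the data.

```r
library(dplyr)

# Hypothetical toy data; the original post does not show df
df <- tibble(
  category = c("a", "a", "b", "b", "c", "c"),
  x = c(1, 3, 5, 7, 9, 11),
  y = c(2, 4, 6, 8, 10, 12),
  z = c(10, 20, 30, 40, 50, 60)
)

# One across() call per function; each keeps the original column name
df_summary <- df %>%
  group_by(category) %>%
  summarise(
    across(x, mean),
    across(y, median),
    across(z, first),
    .groups = "drop"
  )

df_summary
```

Each across() call takes the same selection helpers as dplyr::select, so across(c(x, y), mean) or across(starts_with("x"), mean) should work the same way, and no join is needed afterwards.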
Here is another idea. Define a function that wraps summarise_at, and then use map2 to apply it over a look-up list that pairs each function name with the variables it should be applied to. In this example, I applied mean to the x and y columns and median to the z column.
# Define a wrapper that applies one named function to a set of columns
# variable: character vector of column names; func: function name as a string
summarise_at_fun <- function(variable, func, data) {
  f <- match.fun(func)  # look up the function by name, e.g. "mean"
  data %>%
    summarise_at(variable, f)
}
# Group the data
df2 <- df %>% group_by(category)
# Create a look-up list: names are function names, values are the columns to apply them to
look_list <- list(mean = c("x", "y"),
                  median = "z")
# Apply summarise_at_fun to each entry, then join the results by group
map2(look_list, names(look_list), summarise_at_fun, data = df2) %>%
  reduce(left_join, by = "category")
# A tibble: 3 x 4
  category     x     y     z
  <chr>    <dbl> <dbl> <dbl>
1 a            6     6     0
2 b            5     3     8
3 c            2     6     1
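To try the look-up approach end to end, here is a small self-contained sketch. It is a variation and not the answer's exact code: it assumes dplyr 1.0.0 or later, swaps map2 for purrr::imap (which passes each element together with its name), and uses a made-up df because the original data is not shown.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data; the original post does not show df
df <- tibble(
  category = c("a", "a", "b", "b"),
  x = c(1, 3, 5, 7),
  y = c(2, 4, 6, 8),
  z = c(10, 20, 30, 50)
)

# Function names mapped to the columns they should summarise
look_list <- list(mean = c("x", "y"), median = "z")

# imap() calls the function with (element, name): cols = columns, fn = function name
summaries <- imap(look_list, function(cols, fn) {
  f <- match.fun(fn)  # resolve the function name once, outside the data mask
  df %>%
    group_by(category) %>%
    summarise(across(all_of(cols), f), .groups = "drop")
})

# Join the per-function summaries by group
result <- reduce(summaries, left_join, by = "category")
result
```

Adding another function is then a matter of adding one entry to look_list, for example max = "w" for a hypothetical column w.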
Since your question is about summarise_at, here is my idea:
# funs() is deprecated since dplyr 0.8.0; a named list gives the same x_mean, x_sd, ... columns
df %>%
  group_by(category) %>%
  summarise_at(vars(x, y, z),
               list(mean = mean, sd = sd, min = min),
               na.rm = TRUE)
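Note that this applies all three functions to every column, so the result has one column per variable and function pair. A rough across() equivalent, assuming dplyr 1.0.0 or later and a made-up df with a missing value (the post does not show the data), might look like this:

```r
library(dplyr)

# Hypothetical toy data with one NA; the original post does not show df
df <- tibble(
  category = c("a", "a", "b", "b"),
  x = c(1, NA, 5, 7),
  y = c(2, 4, 6, 8),
  z = c(10, 20, 30, 50)
)

# Lambdas carry na.rm = TRUE explicitly instead of passing it through `...`
stats <- df %>%
  group_by(category) %>%
  summarise(
    across(
      c(x, y, z),
      list(
        mean = ~ mean(.x, na.rm = TRUE),
        sd   = ~ sd(.x, na.rm = TRUE),
        min  = ~ min(.x, na.rm = TRUE)
      )
    ),
    .groups = "drop"
  )

names(stats)
```

With a named list of functions, across() names the outputs by combining the column name and the function name, such as x_mean or z_min.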