Question
I want to write a function that takes multiple string inputs (two in this example), passes them to group_by, and still returns results when only one string is supplied. I know I could add if statements to handle the case where only one string is passed, but is there a better way to get group_by to produce output without conditional logic? The conditional approach gets more cumbersome as the number of inputs grows.
Reproducible example
library(dplyr)
# Create simple function
car_fx <- function(df, grp1, grp2) {
  output <- df %>%
    group_by(.data[[grp1]], .data[[grp2]]) %>%
    summarize(mean_hp = mean(hp, na.rm = TRUE))
}
# String inputs
grp1 <- "cyl"
grp2 <- "carb"
# Run and print function output
(car_fx(mtcars, grp1, grp2))
# works fine
# A tibble: 9 x 3
# Groups:   cyl [3]
    cyl  carb mean_hp
  <dbl> <dbl>   <dbl>
1     4     1    77.4
2     4     2    87
3     6     1   108.
4     6     4   116.
5     6     6   175
6     8     2   162.
7     8     3   180
8     8     4   234
9     8     8   335
If I pass only one variable, the function throws an error. I'd like the function to group by just that single variable instead, and I'd like the same approach to work in functions that accept three or more grouping variables.
# Try with just one group, including with NA. Throws error.
(car_fx(mtcars, grp1))
(car_fx(mtcars, grp1, NA))
Answer 1:
You can use the ellipsis (...) to pass an arbitrary number of arguments to a function. In this case, those arguments are the names of whatever columns you want to use in group_by.
# Create simple function
car_fx <- function(df, ...) {
  output <- df %>%
    group_by_at(c(...)) %>%
    summarize(mean_hp = mean(hp, na.rm = TRUE))
}
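As a rough sketch of how the ellipsis version can be called with one or two strings: group_by_at() has been superseded since dplyr 1.0.0 in favor of across(), so the variant below uses group_by(across(all_of(...))) instead. The name car_fx2 is hypothetical and only there to keep it apart from the functions above, and the code assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical variant of car_fx; assumes dplyr >= 1.0.0 for across() and .groups
car_fx2 <- function(df, ...) {
  # Collect the strings passed through ... into one character vector
  grp_cols <- c(...)
  df %>%
    # all_of() selects the columns named in grp_cols and errors on unknown names
    group_by(across(all_of(grp_cols))) %>%
    # .groups = "drop" returns an ungrouped tibble and avoids the regrouping message
    summarize(mean_hp = mean(hp, na.rm = TRUE), .groups = "drop")
}

# One grouping column
one_group <- car_fx2(mtcars, "cyl")
print(one_group)

# Two grouping columns
two_groups <- car_fx2(mtcars, "cyl", "carb")
print(two_groups)
```

Because the column names arrive through ..., the same call pattern should extend to three or more grouping columns without any extra conditional logic.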
Source: https://stackoverflow.com/questions/61778109/handling-a-missing-string-value-to-a-function-when-using-group-by-in-dplyr