Handling a missing string value to a function when using group_by in dplyr

五迷三道 submitted on 2020-05-14 07:17:42

Question


I want to write a function that takes multiple string inputs (two in this example), passes them to group_by, and still returns results when only one string is supplied. I know I could add if statements to handle the case where only one string is passed, but is there a better way to have group_by produce output without conditional logic? That approach gets more cumbersome as the number of inputs grows.

Reproducible example

library(dplyr)

# Create simple function
car_fx <- function(df, grp1, grp2) {
  output <- df %>% 
    group_by(.data[[grp1]], .data[[grp2]]) %>% 
    summarize(mean_hp = mean(hp, na.rm = TRUE))
}



# String inputs
grp1 <- "cyl"
grp2 <- "carb"



# Run and print function output
(car_fx(mtcars, grp1, grp2))

# works fine
# A tibble: 9 x 3
# Groups:   cyl [3]
    cyl  carb mean_hp
  <dbl> <dbl>   <dbl>
1     4     1    77.4
2     4     2    87  
3     6     1   108. 
4     6     4   116. 
5     6     6   175  
6     8     2   162. 
7     8     3   180  
8     8     4   234  
9     8     8   335 

If I pass only one variable, the function throws an error. I'd like the function to behave as if it had been given only that single grouping variable, and I'd like the approach to extend to functions with three or more grouping inputs.

# Try with just one group, including with NA.  Throws error.
(car_fx(mtcars, grp1))
(car_fx(mtcars, grp1, NA))

Answer 1:


You can use the ellipsis (...) to pass an arbitrary number of arguments to a function; in this case, the names of any columns you want to use in group_by.

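# Create simple function; column names are passed as strings through ...
car_fx <- function(df, ...) {
  output <- df %>% 
    # c(...) collects the names into one character vector for group_by_at
    group_by_at(c(...)) %>% 
    summarize(mean_hp = mean(hp, na.rm = TRUE))
}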
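
As a possible alternative, here is a minimal sketch that assumes dplyr 1.0.0 or later, where group_by_at is superseded in favor of across(). The name car_fx2 and the step that drops NA inputs are illustrative assumptions, not part of the original answer; the NA handling is there only because the question also tried car_fx(mtcars, grp1, NA).

```r
library(dplyr)

# Hypothetical variant of car_fx (the name is an assumption for illustration).
# Column names arrive as strings through `...`.
car_fx2 <- function(df, ...) {
  grps <- c(...)               # collect the strings into one character vector
  grps <- grps[!is.na(grps)]   # assumption: treat NA as "no grouping column given"
  df %>%
    group_by(across(all_of(grps))) %>%
    summarize(mean_hp = mean(hp, na.rm = TRUE), .groups = "drop")
}

car_fx2(mtcars, "cyl")          # one grouping column
car_fx2(mtcars, "cyl", "carb")  # two grouping columns
car_fx2(mtcars, "cyl", NA)      # NA is dropped, so this groups by cyl only
```

all_of() makes the selection strict, so a misspelled column name raises an error instead of being silently ignored.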


Source: https://stackoverflow.com/questions/61778109/handling-a-missing-string-value-to-a-function-when-using-group-by-in-dplyr
