R user-defined/dynamic summary function within dplyr::summarise

Submitted by 冷眼眸甩不掉的悲伤 on 2021-02-11 12:14:29

Question


Somewhat hard to define this question without sounding like lots of similar questions!

I have a function for which I want one of the parameters to be a function name, that will be passed to dplyr::summarise, e.g. "mean" or "sum":

library(dplyr)
data(mtcars)

f <- function(x = mtcars,
              groupcol = "cyl",
              zCol = "disp",
              zFun = "mean") {

  zColquo = quo_name(zCol)

  cellSummaries <- x %>%
    group_by(gear, !!sym(groupcol)) %>% # 1 preset grouper, 1 user-defined
    summarise(Count = n(), # 1 preset summary, 1 user-defined
              !!zColquo := mean(!!sym(zColquo))) %>% # mean should be zFun, user-defined
    ungroup()

  cellSummaries
}

(this groups by gear and cyl, then returns, per group, count and mean(disp))

Per my note, I'd like 'mean' to be dynamic, performing the function defined by zFun, but I can't for the life of me work out how to do it! Thanks in advance for any advice.


Answer 1:


You can use match.fun to make the function dynamic. I also removed zColquo as it's not needed.

library(dplyr)
library(rlang)

f <- function(x = mtcars,
              groupcol = "cyl",
              zCol = "disp",
              zFun = "mean") {

  cellSummaries <- x %>%
                   group_by(gear, !!sym(groupcol)) %>% 
                   summarise(Count = n(), 
                             !!zCol := match.fun(zFun)(!!sym(zCol))) %>%
                   ungroup

  return(cellSummaries)
}

You can then check the output:

f()

# A tibble: 8 x 4
#   gear   cyl Count  disp
#  <dbl> <dbl> <int> <dbl>
#1     3     4     1  120.
#2     3     6     2  242.
#3     3     8    12  358.
#4     4     4     8  103.
#5     4     6     4  164.
#6     5     4     2  108.
#7     5     6     1  145 
#8     5     8     2  326 

f(zFun = "sum")

# A tibble: 8 x 4
#   gear   cyl Count  disp
#  <dbl> <dbl> <int> <dbl>
#1     3     4     1  120.
#2     3     6     2  483 
#3     3     8    12 4291.
#4     4     4     8  821 
#5     4     6     4  655.
#6     5     4     2  215.
#7     5     6     1  145 
#8     5     8     2  652 
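
Because match.fun returns its argument unchanged when it is already a function, the same pattern can also accept a function object or an anonymous function, not only a name. The following is a minimal sketch under that assumption; f2 is a hypothetical variant of f written for illustration, and .groups = "drop" is used here in place of the trailing ungroup.

```r
library(dplyr)

# Hypothetical variant of f(): zFun may be a name ("mean") or a function (median)
f2 <- function(x = mtcars,
               groupcol = "cyl",
               zCol = "disp",
               zFun = "mean") {

  # Resolve once: a string is looked up, a function is passed through as-is
  fun <- match.fun(zFun)

  x %>%
    group_by(gear, !!sym(groupcol)) %>%
    summarise(Count = n(),
              !!zCol := fun(!!sym(zCol)),
              .groups = "drop") # drop grouping instead of calling ungroup()
}

f2(zFun = "sum")                          # function given by name
f2(zFun = median)                         # function given as an object
f2(zFun = function(v) max(v) - min(v))    # anonymous function (range width)
```

Resolving the function once, before the pipeline, also means a misspelled name fails immediately with match.fun's own error rather than inside summarise.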



Answer 2:


We can use get to look up the function by its name:

library(dplyr)

f <- function(x = mtcars,
              groupcol = "cyl",
              zCol = "disp",
              zFun = "mean") {

  zColquo = quo_name(zCol)
  x %>%
    group_by(gear, !!sym(groupcol)) %>% # 1 preset grouper, 1 user-defined
    summarise(Count = n(), # 1 preset summary, 1 user-defined
              !!zColquo := get(zFun)(!!sym(zCol))) %>%
    ungroup
}

f()
# A tibble: 8 x 4
#   gear   cyl Count  disp
#  <dbl> <dbl> <int> <dbl>
#1     3     4     1  120.
#2     3     6     2  242.
#3     3     8    12  358.
#4     4     4     8  103.
#5     4     6     4  164.
#6     5     4     2  108.
#7     5     6     1  145 
#8     5     8     2  326 


f(zFun = "sum")
# A tibble: 8 x 4
#   gear   cyl Count  disp
#  <dbl> <dbl> <int> <dbl>
#1     3     4     1  120.
#2     3     6     2  483 
#3     3     8    12 4291.
#4     4     4     8  821 
#5     4     6     4  655.
#6     5     4     2  215.
#7     5     6     1  145 
#8     5     8     2  652 
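
One difference between get and match.fun is worth keeping in mind: by default get returns the first object of any kind with that name, while match.fun only looks for functions. The sketch below uses base R only; lookup_demo is a hypothetical helper, and the local variable named mean is a contrived stand-in for any non-function object that happens to share the function's name.

```r
# Hypothetical helper comparing the three lookups for the same name
lookup_demo <- function(zFun = "mean") {
  mean <- 42  # a local non-function object that shadows base::mean

  c(get_plain = is.function(get(zFun)),                    # finds the local 42
    get_mode  = is.function(get(zFun, mode = "function")), # skips non-functions
    match_fun = is.function(match.fun(zFun)))              # only looks for functions
}

lookup_demo()
# get_plain  get_mode match_fun
#     FALSE      TRUE      TRUE
```

If get is preferred, passing mode = "function" gives it the same protection against a same-named variable.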

In addition, we could drop the sym evaluation in both group_by and summarise by wrapping the column names in across:

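f <- function(x = mtcars,
              groupcol = "cyl",
              zCol = "disp",
              zFun = "mean") {

  # all_of() marks groupcol and zCol as character vectors holding column names;
  # passing them bare to across() triggers a tidyselect warning in recent versions
  x %>%
    group_by(across(c(gear, all_of(groupcol)))) %>% # 1 preset grouper, 1 user-defined
    summarise(Count = n(), # 1 preset summary, 1 user-defined
              across(all_of(zCol), ~ get(zFun)(.))) %>%
    ungroup
}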
f()
# A tibble: 8 x 4
#   gear   cyl Count  disp
#  <dbl> <dbl> <int> <dbl>
#1     3     4     1  120.
#2     3     6     2  242.
#3     3     8    12  358.
#4     4     4     8  103.
#5     4     6     4  164.
#6     5     4     2  108.
#7     5     6     1  145 
#8     5     8     2  326 


Source: https://stackoverflow.com/questions/62906259/r-user-defined-dynamic-summary-function-within-dplyrsummarise
