dplyr pipeline in a function

Posted by 这一生的挚爱 on 2021-02-05 06:37:45

Question


I'm trying to put a dplyr pipeline in a function, but even after reading the vignette multiple times, as well as the tidy evaluation guide (https://tidyeval.tidyverse.org/dplyr.html), I still can't get it to work.

#Sample data:
dat <- read.table(text = "A ID B
1   X   83
2   X   NA
3   X   NA
4   Y   NA
5   X   2
6   Y   2
12   Y   10
7   Y   18
8   Y   85", header = TRUE)

# What I'm trying to do:
x <- dat %>% filter(!is.na(B)) %>% count('ID') %>% filter(freq>3)
x$ID

# Now in a function:
n_occurences <- function(df, n, column){
  # Group by ID and return IDs with number of non-na > n in column
  column <- enquo(column)
  x <- df %>%
       filter(!is.na(!!column))  %>%
       count('ID') %>% filter(freq>n)
  x$ID
}

# Let's try:
col <- 'B'
n_occurences(dat, n=3, column = col)

There is no error, but the output is wrong. This has something to do with tidy evaluation, but I just can't get my head around it.


Answer 1:


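With rlang 0.4.0 or later, we can do this much more easily by using the {{ }} (curly-curly) operator. Note that dplyr's count() takes the unquoted column name and stores the counts in a column named n; count('ID') and freq in the question are plyr's interface.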

library(rlang)
library(dplyr)
n_occurences <- function(df, n1, column){
  df %>%
    filter(!is.na({{column}})) %>%
    count(ID) %>%
    filter(n > n1) %>%
    pull(ID)
}

n_occurences(dat, n1 = 3, column = B)
#[1] Y
#Levels: X Y
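
As for why the attempt in the question ran without error but returned the wrong IDs: when the function is called with column = col, enquo() captures the expression col, and !! then evaluates it to the string "B" rather than to the column B. is.na("B") is always FALSE, so the filter keeps every row. A minimal sketch of that effect outside the function, assuming the same dat as in the question:

```r
library(dplyr)

# Same sample data as in the question
dat <- read.table(text = "A ID B
1   X   83
2   X   NA
3   X   NA
4   Y   NA
5   X   2
6   Y   2
12   Y   10
7   Y   18
8   Y   85", header = TRUE)

col <- "B"

# `col` is not a column of dat, so it resolves to the string "B";
# !is.na("B") is TRUE and is recycled, so nothing is filtered out
kept_string <- dat %>% filter(!is.na(col))

# Referring to the column itself drops the rows where B is NA
kept_column <- dat %>% filter(!is.na(B))

nrow(kept_string)  # 9
nrow(kept_column)  # 6
```

With all 9 rows kept, both X and Y exceed the threshold of 3, which is why the result looked wrong.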

If we intend to pass a quoted string instead, convert it to a symbol (sym) and then unquote it (!!) so that it is evaluated as a column:

n_occurences <- function(df, n1, column){
  column <- rlang::sym(column)
  df %>%
    filter(!is.na(!!column)) %>%
    count(ID) %>%
    filter(n > n1) %>%
    pull(ID)
}


col <- 'B'
n_occurences(dat, n1=3, column = col)
#[1] Y
#Levels: X Y
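
Another option for string input, not part of the answer above, is the .data pronoun, which looks a column up by name without any explicit conversion. The following is only a sketch: the function name n_occurences_str is made up here to avoid clashing with the definitions above, and dat is assumed to be the sample data from the question.

```r
library(dplyr)

# Same sample data as in the question
dat <- read.table(text = "A ID B
1   X   83
2   X   NA
3   X   NA
4   Y   NA
5   X   2
6   Y   2
12   Y   10
7   Y   18
8   Y   85", header = TRUE)

# Hypothetical variant: `column` is a string holding the column name
n_occurences_str <- function(df, n1, column) {
  df %>%
    filter(!is.na(.data[[column]])) %>%  # subset the data by column name
    count(ID) %>%                        # one row per ID, counts in `n`
    filter(n > n1) %>%
    pull(ID)
}

col <- "B"
res <- n_occurences_str(dat, n1 = 3, column = col)
as.character(res)  # "Y"
```

Whether ID comes back as a factor or a character vector depends on how read.table() handled strings in your R version, hence the as.character() in the usage line.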


Source: https://stackoverflow.com/questions/56944486/dplyr-pipeline-in-a-function
