Passing strings as arguments in dplyr verbs

时光取名叫无心 2021-02-19 17:01

I would like to be able to define arguments for dplyr verbs

condition <- "dist > 50"

and then use these strings in dplyr verbs such as filter().

3 Answers
  • 2021-02-19 17:17

    In the next version of dplyr, it will probably work like this:

    condition <- quote(dist > 50)
    
    cars %>%
       filter_(condition)
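
    filter_() did ship, but it has been deprecated since dplyr 0.7.0 in favour of tidy evaluation. A minimal sketch of the same idea in current dplyr, assuming the condition is still captured with quote(), is to unquote it with !! inside filter(). The name result is only illustrative.

    ```r
    library(dplyr)

    # Capture the condition as an unevaluated expression.
    condition <- quote(dist > 50)

    # `!!` injects the captured expression into filter().
    result <- cars %>%
      filter(!!condition)
    ```

    Here condition is an expression rather than a string, so no parsing step is involved.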
    
  • 2021-02-19 17:19

    While they're working on that, here is a workaround using if:

    library(dplyr)
    library(magrittr)
    
    ds <- data.frame(attend = c(1:5,NA,7:9,NA,NA,12))
    
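    # Global flag: TRUE keeps the NA rows, FALSE keeps the non-NA rows.
    filter_na <- FALSE
    
    # Returns x unchanged when filterTF is TRUE, otherwise its negation.
    # The default is evaluated lazily, so it picks up the current filter_na.
    filtertest <- function(x, filterTF = filter_na) {
      if (filterTF) x else !x
    }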
    
    ds %>%
      filter(attend %>% is.na %>% filtertest)
    
      attend
    1      1
    2      2
    3      3
    4      4
    5      5
    6      7
    7      8
    8      9
    9     12
    
    filter_na <- TRUE
    ds %>%
      filter(attend %>% is.na %>% filtertest)
    
      attend
    1     NA
    2     NA
    3     NA
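
    The helper above reads filter_na from the global environment. As a sketch of a variant that avoids the global, the flag can be passed as an ordinary argument. keep_na_rows() and its keep_na parameter are hypothetical names, not part of dplyr.

    ```r
    library(dplyr)

    ds <- data.frame(attend = c(1:5, NA, 7:9, NA, NA, 12))

    # Hypothetical helper: keep_na = TRUE keeps rows where `attend` is NA,
    # keep_na = FALSE keeps the rows where it is not.
    keep_na_rows <- function(data, keep_na = FALSE) {
      data %>%
        filter(is.na(attend) == keep_na)
    }

    non_missing  <- keep_na_rows(ds, keep_na = FALSE)
    only_missing <- keep_na_rows(ds, keep_na = TRUE)
    ```

    Comparing is.na(attend) with the flag replaces the if/else, so the two calls select complementary sets of rows.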
    
  • 2021-02-19 17:25

    Since these 2014 answers, two new ways are possible using rlang's quasiquotation.

    Conventional hard-coded filter statement. For the sake of comparison, the statement dist > 50 is included directly in dplyr::filter().

    library(magrittr)
    
    # The filter statement is hard-coded inside the function.
    cars_subset_0 <- function( ) {
      cars %>%
        dplyr::filter(dist > 50)
    }
    cars_subset_0()
    

    results:

       speed dist
    1     14   60
    2     14   80
    3     15   54
    4     18   56
    ...
    17    25   85
    

    rlang approach with NSE (nonstandard evaluation). As described in the Programming with dplyr vignette, the statement dist > 50 is processed by rlang::enquo(), which "uses some dark magic to look at the argument, see what the user typed, and return that value as a quosure". Then rlang's !! unquotes the input "so that it’s evaluated immediately in the surrounding context".

    # The filter statement is evaluated with NSE.
    cars_subset_1 <- function( filter_statement ) {
      filter_statement_en <- rlang::enquo(filter_statement)
      # as_label() renders the quosure as text; pasting the quosure itself is deprecated.
      message("filter statement: `", rlang::as_label(filter_statement_en), "`.")
    
      cars %>%
        dplyr::filter(!!filter_statement_en)
    }
    cars_subset_1(dist > 50)
    

    results:

    filter statement: `dist > 50`.
       speed dist
    1     14   60
    2     14   80
    3     15   54
    4     18   56
    ...
    17    25   85
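
    Since rlang 0.4.0, the enquo() plus !! pair can also be written with the embrace operator {{ }}. A minimal sketch under that assumption follows; cars_subset_1b is a hypothetical name that continues the numbering used here.

    ```r
    library(dplyr)

    # {{ }} captures the caller's expression and injects it in one step.
    cars_subset_1b <- function(filter_statement) {
      cars %>%
        filter({{ filter_statement }})
    }

    result_1b <- cars_subset_1b(dist > 50)
    ```

    The explicit enquo() form is still useful when the captured expression is needed elsewhere, for example to print it in a message.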
    

    rlang approach passing a string. The statement "dist > 50" is passed to the function as an explicit string, and parsed as an expression by rlang::parse_expr(), then unquoted by !!.

    # The filter statement is passed as a string.
    cars_subset_2 <- function( filter_statement ) {
      filter_statement_expr <- rlang::parse_expr(filter_statement)
      # as_label() deparses the expression; pasting the call directly yields `>dist50`.
      message("filter statement: `", rlang::as_label(filter_statement_expr), "`.")
    
      cars %>%
        dplyr::filter(!!filter_statement_expr)
    }
    cars_subset_2("dist > 50")
    

    results:

    filter statement: `dist > 50`.
       speed dist
    1     14   60
    2     14   80
    3     15   54
    4     18   56
    ...
    17    25   85
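
    The string approach extends to several conditions. As a sketch, each string can be parsed with rlang::parse_expr() and the resulting list spliced into filter() with !!!. cars_subset_3 and its filter_statements parameter are hypothetical names. The parsed text is evaluated as R code, so this is only appropriate for trusted input.

    ```r
    library(dplyr)

    # Hypothetical helper: `filter_statements` is a character vector of conditions.
    cars_subset_3 <- function(filter_statements) {
      # Parse each string into an expression.
      filter_exprs <- lapply(filter_statements, rlang::parse_expr)

      # `!!!` splices the list; filter() combines the conditions with AND.
      cars %>%
        filter(!!!filter_exprs)
    }

    result_3 <- cars_subset_3(c("dist > 50", "speed < 20"))
    ```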
    

    Things are simpler with dplyr::select(). A column name held in a string only needs to be unquoted with !!.

    # The select statement is passed as a string.
    cars_subset_2b <- function( select_statement ) {
      cars %>%
        dplyr::select(!!select_statement)
    }
    cars_subset_2b("dist")
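
    An alternative sketch, assuming dplyr 1.0.0 or later, uses the tidyselect helper all_of(), which accepts a character vector of column names and errors if any of them is missing. cars_subset_2c is a hypothetical name.

    ```r
    library(dplyr)

    # The column names are passed as a character vector.
    cars_subset_2c <- function(select_columns) {
      cars %>%
        select(all_of(select_columns))
    }

    result_2c <- cars_subset_2c("dist")
    ```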
    