Extract column name in mutate_if call

南方客 2020-12-15 11:33

I would like to extract the column name in the function call to mutate_if. With this, I then want to look up a value in a different table and fill in missing va

2 Answers
  • 2020-12-15 12:03

    Julien Nvarre's answer is correct (you need to use quo), but since my first thought would also have been to use enquo, I looked into why quo is needed instead:

    If we look at the source for mutate_if we can see how it is constructed:

    dplyr:::mutate_if
    #> function (.tbl, .predicate, .funs, ...) 
    #> {
    #>     funs <- manip_if(.tbl, .predicate, .funs, enquo(.funs), caller_env(), 
    #>         ...)
    #>     mutate(.tbl, !(!(!funs)))
    #> }
    #> <environment: namespace:dplyr>
    

    By overriding the mutate_if function in dplyr with a slight modification, I can insert a call to print() allowing me to look at the funs object being passed to mutate:

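    library(dplyr)
    library(rlang)  # for enquo() and caller_env()

    # Same as dplyr's mutate_if(), except that the list of quosures is printed
    # instead of being passed on to mutate().
    mutate_if <- function(.tbl, .predicate, .funs, ...) {
      funs <- dplyr:::manip_if(.tbl, .predicate, .funs, enquo(.funs), caller_env(),
                               ...)
      print(funs)
    }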
    

    Then, running your code will use this modified mutate_if function:

    df <- structure(list(x = 1:10, 
                         y = c(1L, 2L, 3L, NA, 1L, 2L, 3L, NA, 1L, 2L), 
                         z = c(NA, 2L, 3L, NA, NA, 2L, 3L, NA, NA, 2L), 
                         a = c("a", "b", "c", "d", "e", "a", "b", "c", "d", "e")), 
                    .Names = c("x", "y", "z", "a"), 
                    row.names = c(NA, -10L), 
                    class = c("tbl_df", "tbl", "data.frame"))
    df_lookup <- tibble(x = 0L, y = 5L, z = 8L)
    
    df %>% 
      mutate_if(is.numeric, funs({
        x <- .
        x <- enquo(x)
        lookup_value <- df_lookup %>% pull(quo_name(x))
        x <- ifelse(is.na(x), lookup_value, x)
        return(x)
      }))
    #> $x
    #> <quosure>
    #>   expr: ^{
    #>           x <- x
    #>           x <- enquo(x)
    #>           lookup_value <- df_lookup %>% pull(quo_name(x))
    #>           x <- ifelse(is.na(x), lookup_value, x)
    #>           return(x)
    #>         }
    #>   env:  0000000007FBBFA0
    #> 
    #> $y
    #> <quosure>
    #>   expr: ^{
    #>           x <- y
    #>           x <- enquo(x)
    #>           lookup_value <- df_lookup %>% pull(quo_name(x))
    #>           x <- ifelse(is.na(x), lookup_value, x)
    #>           return(x)
    #>         }
    #>   env:  0000000007FBBFA0
    #> 
    #> $z
    #> <quosure>
    #>   expr: ^{
    #>           x <- z
    #>           x <- enquo(x)
    #>           lookup_value <- df_lookup %>% pull(quo_name(x))
    #>           x <- ifelse(is.na(x), lookup_value, x)
    #>           return(x)
    #>         }
    #>   env:  0000000007FBBFA0
    

    Now we can see that, in the function list passed to the mutate call, the . placeholder has already been replaced by the name of the column. Within each expression there is therefore a variable called x, y, or z whose value comes from df.

    Consider the simple case:

    library(rlang)
    x <- 1:10
    quo(x)
    #> <quosure>
    #>   expr: ^x
    #>   env:  0000000007615318
    enquo(x)
    #> <quosure>
    #>   expr: ^<int: 1L, 2L, 3L, 4L, 5L, ...>
    #>   env:  empty
    

    From this, hopefully you can see why you want quo rather than enquo: you are after the column name, which is the name of the variable, and that is what quo gives you.
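
    The difference can also be seen inside a function, which is where enquo is normally used. The following is only a minimal sketch; the helper names name_via_enquo and name_via_quo and the symbol my_col are hypothetical and chosen just for this illustration. enquo captures the expression the caller supplied, while quo quotes the symbol exactly as it is written in the code:

    ```r
    library(rlang)

    # Hypothetical helpers, named only for this illustration.
    # enquo() captures the expression the caller passed as `arg`.
    name_via_enquo <- function(arg) quo_name(enquo(arg))
    # quo() quotes the symbol `arg` itself, exactly as written.
    name_via_quo <- function(arg) quo_name(quo(arg))

    # my_col is never evaluated, so it does not need to exist.
    enquo_result <- name_via_enquo(my_col)
    quo_result <- name_via_quo(my_col)

    enquo_result
    #> [1] "my_col"
    quo_result
    #> [1] "arg"
    ```

    In the mutate_if case, the . has already been replaced by the column symbol before the expression runs, so quoting it as written with quo is what yields the column name.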

    Thus, using quo instead of enquo, and without assigning . to a variable first:

    mutate_if(df, is.numeric, funs({
      lookup_value <- df_lookup %>% pull(quo_name(quo(.)))
      ifelse(is.na(.), lookup_value, .)
    }))
    
  • 2020-12-15 12:15

    You have to use quo instead of enquo:

    # enquo(.):
    <quosure: empty>
    ~function (expr) 
    {
        enexpr(expr)
    }
    ...
    
    # quo(.):
    <quosure: frame>
    ~x
    <quosure: frame>
    ~y
    <quosure: frame>
    ~z
    

    With your example:

    mutate_if(df, is.numeric, funs({
      lookup_value <- df_lookup %>% pull(quo_name(quo(.)))
      ifelse(is.na(.), lookup_value, .)
    }))
    
    # A tibble: 10 x 4
           x     y     z a    
       <int> <int> <int> <chr>
     1     1     1     8 a    
     2     2     2     2 b    
     3     3     3     3 c    
     4     4     5     8 d    
     5     5     1     8 e    
     6     6     2     2 a    
     7     7     3     3 b    
     8     8     5     8 c    
     9     9     1     8 d    
    10    10     2     2 e    
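
    As a possible alternative that is not part of the answers above: assuming dplyr 1.0.0 or later, across() together with cur_column() gives access to the current column name directly, without any quoting. This is a sketch under that assumption; the result name filled is chosen for illustration, and df is rebuilt with tibble() from the same values as in the question.

    ```r
    library(dplyr)

    # Same data as in the question, rebuilt with tibble().
    df <- tibble(
      x = 1:10,
      y = c(1L, 2L, 3L, NA, 1L, 2L, 3L, NA, 1L, 2L),
      z = c(NA, 2L, 3L, NA, NA, 2L, 3L, NA, NA, 2L),
      a = c("a", "b", "c", "d", "e", "a", "b", "c", "d", "e")
    )
    df_lookup <- tibble(x = 0L, y = 5L, z = 8L)

    # cur_column() returns the name of the column currently being processed,
    # so it can be used to look up the replacement value in df_lookup.
    filled <- df %>%
      mutate(across(
        where(is.numeric),
        ~ ifelse(is.na(.x), df_lookup[[cur_column()]], .x)
      ))
    ```

    Here filled should contain the same values as the tibble shown above: missing entries in y become 5 and missing entries in z become 8.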
    