Using filter_ in dplyr where both field and value are in variables


I want to filter a dataframe using a field which is defined in a variable, to select a value that is also in a variable. Say I have

df <- data.frame(V=c(6


                      
4 answers

半阙折子戏, 2020-12-15 10:04

rlang 0.4.0 introduced a new, more intuitive way to handle this type of use case:

library(dplyr)

packageVersion("rlang")
# [1] ‘0.4.0’

df <- data.frame(V=c(6, 1, 5, 3, 2), Unhappy=c("N", "Y", "Y", "Y", "N"))
fld <- "Unhappy"
sval <- "Y"

# Column name held as a string: index the .data pronoun
df %>% filter(.data[[fld]]==sval)

# OR, inside a function, embrace a bare (unquoted) column name with {{ }}
filter_col_val <- function(df, fld, sval) {
  df %>% filter({{fld}}==sval)
}

filter_col_val(df, Unhappy, "Y")


More information can be found at https://www.tidyverse.org/articles/2019/06/rlang-0-4-0/

Previous Answer

With dplyr 0.6.0 and later, this code works:

packageVersion("dplyr")
# [1] ‘0.7.1’

df <- data.frame(V=c(6, 1, 5, 3, 2), Unhappy=c("N", "Y", "Y", "Y", "N"))
fld <- "Unhappy"
sval <- "Y"

df %>% filter(UQ(rlang::sym(fld))==sval)

#OR
df %>% filter((!!rlang::sym(fld))==sval)

#OR
fld <- quo(Unhappy)
sval <- "Y"
df %>% filter(UQ(fld)==sval)


More about the dplyr syntax is available at http://dplyr.tidyverse.org/articles/programming.html, and quosure usage is covered in the rlang package documentation: https://cran.r-project.org/web/packages/rlang/index.html

If you find non-standard evaluation in dplyr 0.6+ hard to master, Alex Hayes has an excellent write-up on the topic: https://www.alexpghayes.com/blog/gentle-tidy-eval-with-examples/

Original Answer

With dplyr version 0.5.0 and later, it is possible to use a simpler syntax that is closer to what @Ricky originally wanted, and which I also find more readable than lazyeval::interp:

df %>% filter_(.dots = paste0(fld, "=='", sval, "'"))

#  V Unhappy
#1 1       Y
#2 5       Y
#3 3       Y

#OR
df %>% filter_(.dots = glue::glue("{fld}=='{sval}'"))

                                                                        
                                                        
            
            
              
                
情书的邮戳, 2020-12-15 10:09

You can try interp() from lazyeval:

 library(lazyeval)
 library(dplyr)
 df %>%
     filter_(interp(~v==sval, v=as.name(fld)))
 #   V Unhappy
 #1 1       Y
 #2 5       Y
 #3 3       Y


For multiple key/value pairs, the following works, though there is probably a better way.

  df1 %>% 
    filter_(interp(~v==sval1[1] & y ==sval1[2], 
           .values=list(v=as.name(fld1[1]), y= as.name(fld1[2]))))
 #  V Unhappy Col2
 #1 1       Y    B
 #2 5       Y    B


For these cases, I find the base R option easier. For example, to filter rows on the key columns in 'fld1' with the corresponding values in 'sval1', one option is Map. We subset the dataset (df1[fld1]), apply the function == to each column of df1[fld1] with its corresponding value in 'sval1', and then combine the results with Reduce and & to get a single logical vector that selects the rows of 'df1'.

 df1[Reduce(`&`, Map(`==`, df1[fld1], sval1)),]
 #   V Unhappy Col2
 # 2 1       Y    B
 # 3 5       Y    B


data

df1 <- cbind(df, Col2= c("A", "B", "B", "C", "A"))
fld1 <- c(fld, 'Col2')
sval1 <- c(sval, 'B')    
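
To make the Map/Reduce steps described above concrete, here is a small self-contained sketch that rebuilds the example data and exposes the intermediate results. The names per_column and keep are illustrative only and are not part of the original answer.

```r
# Rebuild the example data used in this answer
df <- data.frame(V = c(6, 1, 5, 3, 2), Unhappy = c("N", "Y", "Y", "Y", "N"))
df1 <- cbind(df, Col2 = c("A", "B", "B", "C", "A"))
fld1 <- c("Unhappy", "Col2")
sval1 <- c("Y", "B")

# Step 1: compare each selected column with its own target value.
# Map() returns a list with one logical vector per column.
per_column <- Map(`==`, df1[fld1], sval1)

# Step 2: AND the per-column vectors together, so a row is kept
# only when every key/value pair matches.
keep <- Reduce(`&`, per_column)
# keep is FALSE TRUE TRUE FALSE FALSE

# Step 3: use the combined logical vector as a row index.
result <- df1[keep, ]
print(result)
```

Keeping the intermediate list around makes it easy to check which condition excluded a given row.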

                                                                        
                                                        
            
            
              
                
遥遥无期, 2020-12-15 10:15

Here's an alternative with base R, which is maybe not very elegant, but it might have the benefit of being rather easily understandable:

df[df[colnames(df)==fld]==sval,]
#  V Unhappy
#2 1       Y
#3 5       Y
#4 3       Y
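
The same row selection can probably be written more compactly with [[ ]], which extracts the column named by fld as a plain vector. This is a sketch under the assumption that a missing value in the column should not select a row; wrapping the condition in which() is one way to drop NA matches. The names keep, res and res_no_na are illustrative.

```r
df <- data.frame(V = c(6, 1, 5, 3, 2), Unhappy = c("N", "Y", "Y", "Y", "N"))
fld <- "Unhappy"
sval <- "Y"

# df[[fld]] looks the column up by the string stored in fld
keep <- df[[fld]] == sval

# Logical row index: keeps rows 2, 3 and 4
res <- df[keep, ]

# which() converts to row positions and silently drops any NA comparisons
res_no_na <- df[which(keep), ]
print(res)
```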

                                                                        
                                                        
            
            
              
                
孤街浪徒, 2020-12-15 10:15

Following on from LmW: personally, I prefer to build the dots before the pipeline, so that the filter is easier to use programmatically, say in a loop of filters.

dots <-  paste0(fld," == '",sval,"'")
df   %>% filter_(.dots = dots)


LmW's example is correct but the values are hardcoded.
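
As a rough sketch of the "loop of filters" idea without building code strings, one option on dplyr versions that provide the .data pronoun is to apply one condition per iteration. The vectors flds and svals and the accumulator out are hypothetical names introduced here; the example assumes dplyr is installed and that all values are compared as strings.

```r
library(dplyr)

df1 <- data.frame(
  V = c(6, 1, 5, 3, 2),
  Unhappy = c("N", "Y", "Y", "Y", "N"),
  Col2 = c("A", "B", "B", "C", "A")
)

# Hypothetical field/value pairs to apply one after another
flds <- c("Unhappy", "Col2")
svals <- c("Y", "B")

out <- df1
for (i in seq_along(flds)) {
  # .data[[...]] selects the column by the string in flds[i]
  out <- out %>% filter(.data[[flds[i]]] == svals[i])
}
print(out)
```

Unlike the pasted string, this form does not break when a value contains a quote character.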
                                                                        
                                                        
            
            
              
                