Dplyr Filter Multiple Like Conditions

瘦欲@ 提交于 2020-01-11 14:13:09

问题


I am trying to do a filter in dplyr where a column is like certain observations. I can use sqldf as

Test <- sqldf("select * from database 
          Where SOURCE LIKE '%ALPHA%' 
          OR SOURCE LIKE '%BETA%' 
          OR SOURCE LIKE '%GAMMA%'")

I tried to use the following which doesn't return any results:

database %>% dplyr::filter(SOURCE %like% c('%ALPHA%', '%BETA%', '%GAMMA%'))

Thanks


回答1:


You can use grepl with ALPHA|BETA|GAMMA, which will match if any of the three patterns is contained in SOURCE column.

database %>% filter(grepl('ALPHA|BETA|GAMMA', SOURCE))

If you want it to be case insensitive, add ignore.case = T in grepl.




回答2:


%like% is from the data.table package. You're probably also seeing this warning message:

Warning message:
In grepl(pattern, vector) :
  argument 'pattern' has length > 1 and only the first element will be used

The %like% operator is just a wrapper around the grepl function, which does string matching using regular expressions. So % aren't necessary, and in fact they represent literal percent signs.

You can only supply one pattern to match at a time, so either combine them using the regex 'ALPHA|BETA|GAMMA' (as Psidom suggests) or break the tests into three statements:

database %>%
  dplyr::filter(
    SOURCE %like% 'ALPHA' |
      SOURCE %like% 'BETA' |
      SOURCE %like% 'GAMMA'
    )


来源:https://stackoverflow.com/questions/47057246/dplyr-filter-multiple-like-conditions

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!