grepl in R to find matches to any of a list of character strings

∥☆過路亽.° submitted on 2019-11-28 06:58:54

You can use the alternation operator (|), i.e. an "or", inside the regular expression passed to grepl.

ifelse(grepl("dog|cat", data$animal), "keep", "discard")
# [1] "keep"    "keep"    "discard" "keep"    "keep"    "keep"    "keep"    "discard"
# [9] "keep"    "keep"    "keep"    "keep"    "keep"    "keep"    "discard" "keep"   
#[17] "discard" "keep"    "keep"    "discard" "keep"    "keep"    "discard" "keep"   
#[25] "keep"    "keep"    "keep"    "keep"    "keep"    "keep"    "keep"    "keep"   
#[33] "keep"    "discard" "keep"    "discard" "keep"    "discard" "keep"    "keep"   
#[41] "keep"    "keep"    "keep"    "keep"    "keep"    "keep"    "keep"    "keep"   
#[49] "keep"    "discard"

The regular expression dog|cat tells the regular expression engine to look for either "dog" or "cat"; grepl then returns TRUE for every element that contains at least one of them.
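One thing worth seeing on a small input is that the alternation matches substrings, not whole words. The sketch below uses a made-up vector (`animal` here is hypothetical, not the asker's data) and shows an anchored variant for when only exact values should count.

```r
# Hypothetical sample vector, not the data from the question
animal <- c("dog", "cat", "bird", "hotdog", "caterpillar")

# Unanchored alternation: TRUE wherever "dog" or "cat" appears anywhere
hits <- grepl("dog|cat", animal)
hits
# [1]  TRUE  TRUE FALSE  TRUE  TRUE

# Anchored alternation: TRUE only when the whole string is "dog" or "cat"
exact <- grepl("^(dog|cat)$", animal)
exact
# [1]  TRUE  TRUE FALSE FALSE FALSE

ifelse(hits, "keep", "discard")
# [1] "keep"    "keep"    "discard" "keep"    "keep"
```

Whether "hotdog" should count as a match depends on the data, so pick the anchored or unanchored form accordingly.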

Not sure what you tried but this seems to work:

data$keep <- ifelse(grepl(paste(matches, collapse = "|"), data$animal), "Keep","Discard")

Similar to the answer you linked to.

The trick is using the paste:

paste(matches, collapse = "|")
#[1] "cat|dog"

This builds a regular expression that matches either dog OR cat, so it also works with a long list of patterns without typing each one out.
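One caveat with the pasted pattern: every string in the list is interpreted as a regular expression, so entries containing characters such as ( ) or . may not match literally. The sketch below uses hypothetical values for `matches` and `animal` to show the difference, and one possible alternative that treats each entry as a fixed string by combining per-pattern grepl results with `|`.

```r
# Hypothetical patterns and data; "cat (wild)" contains regex metacharacters
matches <- c("dog", "cat (wild)")
animal  <- c("dog", "cat (wild)", "cat wild", "bird")

# Pasted regex: the parentheses act as a group, so the pattern
# matches "cat wild" rather than the literal "cat (wild)"
as_regex <- grepl(paste(matches, collapse = "|"), animal)
as_regex
# [1]  TRUE FALSE  TRUE FALSE

# Literal alternative: one fixed-string grepl per pattern, OR-ed together
as_literal <- Reduce(`|`, lapply(matches, grepl, x = animal, fixed = TRUE))
as_literal
# [1]  TRUE  TRUE FALSE FALSE
```

For plain alphanumeric patterns like "cat" and "dog" both approaches give the same result, and the pasted regex is the shorter of the two.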

Edit:

In case you are doing this to later subset the data.frame according to the "Keep" and "Discard" entries, you can do it more directly:

data[grepl(paste(matches, collapse = "|"), data$animal),]

This way, the TRUE/FALSE results of grepl are used directly as the row index for the subset.
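A minimal, self-contained sketch of that subsetting, assuming a small made-up data.frame (the `id` column and the values are hypothetical):

```r
# Hypothetical data.frame standing in for the asker's `data`
data <- data.frame(
  id     = 1:5,
  animal = c("dog", "bird", "cat", "fish", "dog"),
  stringsAsFactors = FALSE
)
matches <- c("cat", "dog")

# Logical vector: TRUE for rows whose animal matches any pattern
keep <- grepl(paste(matches, collapse = "|"), data$animal)

kept    <- data[keep, ]    # rows 1, 3 and 5
dropped <- data[!keep, ]   # rows 2 and 4
```

Storing the logical vector once makes it easy to get both the kept and the discarded rows without running grepl twice.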

Try to avoid ifelse as much as possible. Indexing a label vector with the logical result plus 1 (FALSE becomes 1, TRUE becomes 2) works nicely, for example:

c("Discard", "Keep")[grepl("(dog|cat)", data$animal) + 1]

With a seed of 123 you will get:

##  [1] "Keep"    "Keep"    "Discard" "Keep"    "Keep"    "Keep"    "Discard" "Keep"   
##  [9] "Discard" "Discard" "Keep"    "Discard" "Keep"    "Discard" "Keep"    "Keep"   
## [17] "Keep"    "Keep"    "Keep"    "Keep"    "Keep"    "Keep"    "Keep"    "Keep"   
## [25] "Keep"    "Keep"    "Discard" "Discard" "Keep"    "Keep"    "Keep"    "Keep"   
## [33] "Keep"    "Keep"    "Keep"    "Discard" "Keep"    "Keep"    "Keep"    "Keep"   
## [41] "Keep"    "Discard" "Discard" "Keep"    "Keep"    "Keep"    "Keep"    "Discard"
## [49] "Keep"    "Keep"   