Partial String Match in R using the %in% operator?

Submitted by 和自甴很熟 on 2021-01-28 12:20:15

Question


I'm curious to know if it is possible to do partial string matches using the %in% operator in R. I know there are many ways to find partial string matches (stringr, etc.), but my current code would be easier to adapt if I could keep using the %in% operator.

For instance, imagine this vector:

x <- c("Withdrawn", "withdrawn", "5-Withdrawn", "2-WITHDRAWN", "withdrawnn")

I want each of these to be TRUE because the string contains "Withdrawn", but only the first is TRUE:

x %in% c("Withdrawn")
[1]  TRUE FALSE FALSE FALSE FALSE

I tried using regex to at least make it case insensitive, but that made everything false:

x %in% c("(?i)Withdrawn")
[1] FALSE FALSE FALSE FALSE FALSE

So, is it possible to yield TRUE on all of these using the %in% operator, maybe with a wrapper? Because it's easy to use tolower() or toupper(), I'm not as concerned with case sensitivity; however, it is important to me that the code matches "withdrawn", "withdrawnn", and "5-withdrawn".

EDIT: This question was marked as a duplicate of the question "Case-insensitive search of a list in R"; however, it is different because it asks whether partial string matches are possible using the %in% operator. The linked question does not use the %in% operator at all.


Answer 1:


%in% does not support this: it’s a wrapper for the match function, which establishes matches by equality comparison, not by regular expression matching. However, you can implement your own operator:
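To make the equality-comparison point concrete, here is a minimal sketch using the vector from the question. It relies on the equivalence given in R's documentation for match, where %in% is defined as match(x, table, nomatch = 0) > 0; the variable names pos, via_match and via_in are only illustrative.

```r
x <- c("Withdrawn", "withdrawn", "5-Withdrawn", "2-WITHDRAWN", "withdrawnn")

# match() returns the position of the first exact match in the table,
# or the nomatch value (0 here) when there is none
pos <- match(x, "Withdrawn", nomatch = 0L)

# %in% asks only whether such a position exists
via_match <- pos > 0L
via_in <- x %in% "Withdrawn"

print(via_match)                     # TRUE FALSE FALSE FALSE FALSE
print(identical(via_match, via_in))  # TRUE
```

Since the comparison is for exact equality, a string such as "(?i)Withdrawn" is treated as literal text, which is why the regex attempt in the question returns FALSE everywhere.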

# For each pattern, TRUE if it matches at least one element of list
`%rin%` = function (pattern, list) {
  vapply(pattern, function (p) any(grepl(p, list)), logical(1L), USE.NAMES = FALSE)
}

And this can be used like %in%:

> '^foo.*' %rin% c('foo', 'foobar')
[1] TRUE
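To see what this operator returns when given several patterns, the sketch below repeats the %rin% definition so that it runs on its own and applies it to the vector from the question. The patterns "^5-" and "cancelled" are made up for illustration and do not come from the question.

```r
`%rin%` <- function(pattern, list) {
  vapply(pattern, function(p) any(grepl(p, list)), logical(1L), USE.NAMES = FALSE)
}

x <- c("Withdrawn", "withdrawn", "5-Withdrawn", "2-WITHDRAWN", "withdrawnn")

# One result per pattern on the left, not one per element of x
res <- c("Withdrawn", "^5-", "cancelled") %rin% x
print(res)  # TRUE TRUE FALSE
```

The result has length 3 (the number of patterns), whereas x %in% ... would return one value for each of the five strings in x.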

Note that this does not behave quite the way you asked for. As with grepl, pattern matching is asymmetric: the patterns go on the left-hand side, and you can’t swap the two operands. The result also has one element per pattern rather than one per string. If you just want to match a list against a single regular expression, use grepl directly:

> grepl("(?i)Withdrawn", x)
[1] TRUE TRUE TRUE TRUE TRUE

Or, if you prefer using an operator:

`%matches%` = grepl
> "(?i)Withdrawn" %matches% x
[1] TRUE TRUE TRUE TRUE TRUE
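If you would rather keep the operand order of %in% (values on the left, things to look for on the right), one possible wrapper is sketched below. The operator name %pin% is hypothetical and not part of base R or any package. It assumes the right-hand side contains valid regular expressions with no special characters that need escaping, and it uses grepl's ignore.case argument instead of the inline (?i) flag.

```r
# Hypothetical operator: TRUE for each element of x that contains
# any of the given patterns, ignoring case
`%pin%` <- function(x, patterns) {
  # Join the patterns into a single alternation, e.g. "withdrawn|cancelled"
  grepl(paste(patterns, collapse = "|"), x, ignore.case = TRUE)
}

x <- c("Withdrawn", "withdrawn", "5-Withdrawn", "2-WITHDRAWN", "withdrawnn")

all_withdrawn <- x %pin% "withdrawn"
print(all_withdrawn)  # TRUE TRUE TRUE TRUE TRUE

# "active" and "cancelled" are made-up values for illustration
mixed <- c("Withdrawn", "active") %pin% c("withdrawn", "cancelled")
print(mixed)          # TRUE FALSE
```

Like %in%, this returns one logical value per element of the left-hand side, so it can usually replace %in% in existing subsetting code without changing anything else.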


Source: https://stackoverflow.com/questions/56649988/partial-string-match-in-r-using-the-in-operator
