R Ignore character within a Regex string

Posted by 假如想象 on 2021-01-27 21:36:09

Question


I've looked all over for some regex that will cause R to disregard the next character within a regular expression string.

For example, given myvector:

 myvector <- c("abcdef", "ghijkl", "mnopqrs")

and a regex string:

 regexstring <- "[a-z]{3}XXXXXXXXX "

which includes some unknown characters XXXXXXXXX, I want to tell R to ignore the final space in the regular expression string itself.

Running the following

regexstring <- "[a-z]{3} "
sub(regexstring, " ", myvector)

gives,

"abcdef"  "ghijkl"  "mnopqrs"

because there are no spaces in any of the strings. But hopefully after including XXXXXXXXX I will get the same output as if I had run

regexstring <- "[a-z]{3}"
sub(regexstring, " ", myvector)

which is:

 " def"  " jkl"  " pqrs"

I can't erase the final space or use trimws(), etc, and I don't see a way I can make R disregard the final space. Is there any XXXXXXXXX that does this? Thanks.


Answer 1:


The final space can be turned into a formatting (insignificant) space by putting the (?x) free-spacing inline modifier in place of the XXXs. You also need to pass the perl=TRUE argument so that the pattern is parsed by the PCRE regex engine.

myvector <- c("abcdef", "ghijkl", "mnopqrs")
regexstring <- "[a-z]{3}(?x) "
sub(regexstring, " ", myvector, perl=TRUE) 
## => [1] " def"  " jkl"  " pqrs"


Note that placing (?x) in the middle of the pattern affects all literal whitespace to its right, either until the end of the pattern (or of the enclosing group, if the modifier sits inside one) or until a (?-x) modifier switches free-spacing off again.
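To illustrate the scope rule, here is a small sketch in base R. The test strings and variable names are made up for this example and are not from the question. It also relies on the PCRE behaviour that a backslash-escaped space stays literal in free-spacing mode.

```r
x <- c("abc def", "abcdef")

# (?x) makes the first space insignificant; (?-x) restores literal whitespace,
# so the second space has to match a real space in the input.
scoped <- "[a-z]{3}(?x) (?-x) [a-z]{3}"
res_scoped <- sub(scoped, "MATCH", x, perl = TRUE)
print(res_scoped)
## => [1] "MATCH"  "abcdef"

# Without (?-x), both spaces are ignored and the pattern acts like [a-z]{6}.
unscoped <- "[a-z]{3}(?x)  [a-z]{3}"
res_unscoped <- sub(unscoped, "MATCH", x, perl = TRUE)
print(res_unscoped)
## => [1] "abc def" "MATCH"

# Inside free-spacing mode, an escaped space ("\\ " in an R string) is literal.
escaped <- "(?x) [a-z]{3} \\  [a-z]{3}"
res_escaped <- sub(escaped, "MATCH", x, perl = TRUE)
print(res_escaped)
## => [1] "MATCH"  "abcdef"
```

So if the rest of a pattern after (?x) still needs to match real spaces, either switch the mode off with (?-x) or escape those spaces.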




Answer 2:


Building on Wiktor Stribizew's answer, I was able to figure out how to do this with stringr:

library(stringr)
myvector    <- c("abcdef", "ghijkl", "mnopqrs")
# comments = TRUE makes the regex engine ignore the trailing space in the pattern
regexstring <- regex("[a-z]{3}# ", comments = TRUE)
myvector %>% str_replace(regexstring, " ")

[1] " def"  " jkl"  " pqrs"

This way, I'm able to modify the regex string itself (regexstring) rather than the replacement command (sub or str_replace).
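As far as I understand it, comments = TRUE makes the engine ignore unescaped whitespace as well as anything after a #, so the # is probably not what hides the space here. A minimal sketch of that behaviour, assuming stringr is installed (the input strings and variable names are invented for illustration):

```r
library(stringr)

x <- c("abcdef", "abc def")

# Unescaped whitespace is ignored in comments mode, even without a '#'.
ignored <- regex("[a-z]{3} ", comments = TRUE)
res_ignored <- str_replace(x, ignored, "_")
print(res_ignored)
## => [1] "_def"  "_ def"

# To match a real space in comments mode, escape it: "\\ " in an R string.
literal <- regex("[a-z]{3} \\ ", comments = TRUE)
res_literal <- str_replace(x, literal, "_")
print(res_literal)
## => [1] "abcdef" "_def"
```

The same caveat as with (?x) applies: once comments mode is on, any space you actually want to match has to be escaped.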



Source: https://stackoverflow.com/questions/47583220/r-ignore-character-within-a-regex-string
