How to escape closed bracket “]” in regex in R

Posted by 橙三吉。 on 2019-12-01 14:52:43

Question


I'm trying to use gsub in R to strip a bunch of odd characters from some strings I'm processing. Everything works until I add "]" to the character class, at which point the call does nothing. I'm escaping with \\, as in gsub("[\\?\\*\\]]", "", name), but it still doesn't work. Here's my actual example:

name <- "R U Still Down? [Remember Me]"

What I want is for names to be "R U Still Down Remember Me".

When I do names <- gsub("[\\(\\)\\*\\$\\+\\?'\\[]", "", name), it half works and I get "R U Still Down Remember Me]".

But when I do names <- gsub("[\\(\\)\\*\\$\\+\\?'\\[\\]]", "", name), nothing happens (i.e. I get "R U Still Down? [Remember Me]").

Any ideas? I've tried switching the order of things around, but I can't figure it out.


Answer 1:


Just set perl=TRUE.

> gsub("[?\\]\\[*]", "", name, perl=TRUE)
[1] "R U Still Down Remember Me"

And escape only the characters that need it.

> gsub("[()*$+?'\\[\\]]", "", name, perl=TRUE)
[1] "R U Still Down Remember Me"
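To see both behaviours side by side, the sketch below runs the pattern from the question with the default engine and the pattern above with perl = TRUE. The variable names default_result and perl_result are illustrative only. One plausible reading of the failure, consistent with the output reported in the question, is that the default engine does not treat \\] as an escaped bracket inside a character class, so the class closes at the first ] and the pattern then requires a literal ] after it.

```r
name <- "R U Still Down? [Remember Me]"

# Default engine: the pattern from the question leaves the string unchanged.
default_result <- gsub("[\\(\\)\\*\\$\\+\\?'\\[\\]]", "", name)

# perl = TRUE: inside the class, \\] is read as a literal "]".
perl_result <- gsub("[()*$+?'\\[\\]]", "", name, perl = TRUE)

print(default_result)
# [1] "R U Still Down? [Remember Me]"
print(perl_result)
# [1] "R U Still Down Remember Me"
```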



Answer 2:


You can reorder the character class instead of escaping: when "]" is the first character in the class, it is treated as a literal.

name <- 'R U Still Down? [Remember Me][*[[]*'
gsub('[][?*]', '', name)
# [1] "R U Still Down Remember Me"
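Applied to the full character set from the question, the same ordering trick might look like the following. This is a sketch using the default engine, and cleaned is just an illustrative name.

```r
name <- "R U Still Down? [Remember Me]"

# "]" comes first in the class, so it is taken literally;
# the other characters need no escaping inside [...].
cleaned <- gsub("[][()*$+?']", "", name)
print(cleaned)
# [1] "R U Still Down Remember Me"
```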

If you want to remove all punctuation characters, use the POSIX class [:punct:]:

gsub('[[:punct:]]', '', name)

Within the ASCII range, this class matches every character that is not a control character, an alphanumeric, or a space.

ascii <- rawToChar(as.raw(0:127), multiple=TRUE)
paste(ascii[grepl('[[:punct:]]', ascii)], collapse="")
# [1] "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
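Because [:punct:] covers so many characters, it can remove more than intended. A small sketch of that trade-off follows; the second title is a made-up example, not from the question, and titles and stripped are illustrative names.

```r
titles <- c("R U Still Down? [Remember Me]", "Don't Stop (Live)")

# [[:punct:]] strips every ASCII punctuation character, including
# apostrophes and parentheses, not only brackets.
stripped <- gsub("[[:punct:]]", "", titles)
print(stripped)
# [1] "R U Still Down Remember Me" "Dont Stop Live"
```

If apostrophes or similar characters should survive, an explicit class such as the ones shown earlier is probably the safer choice.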


Source: https://stackoverflow.com/questions/32041265/how-to-escape-closed-bracket-in-regex-in-r
