Question
I'm trying to use gsub in R to strip a set of special characters from some strings I'm processing. Everything works until I add "]" to the character class, at which point the call does nothing. I'm escaping with \\, like gsub("[\\?\\*\\]]", "", name), but it still doesn't work. Here's my actual example:
name <- "R U Still Down? [Remember Me]"
What I want is for names to be "R U Still Down Remember Me".
When I do:
names <- gsub("[\\(\\)\\*\\$\\+\\?'\\[]", "", name)
it partly works and I get "R U Still Down Remember Me]".
But when I do:
names <- gsub("[\\(\\)\\*\\$\\+\\?'\\[\\]]", "", name)
nothing happens (i.e., I get "R U Still Down? [Remember Me]").
Any ideas? I've tried switching around the order of things, etc. But I can't seem to figure it out.
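Answer 1:
Just enable the perl=TRUE parameter, so the pattern is handled by the PCRE engine, where \\] inside a character class is an escaped literal bracket.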
> gsub("[?\\]\\[*]", "", name, perl=T)
[1] "R U Still Down Remember Me"
You also only need to escape the characters that actually require it:
> gsub("[()*$+?'\\[\\]]", "", name, perl=T)
[1] "R U Still Down Remember Me"
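To see the difference between the two engines side by side, here is a minimal sketch that runs the question's pattern with and without perl=TRUE. The explanation in the comments is my understanding of POSIX-style bracket rules (a backslash does not escape "]" inside a bracket expression), not something stated in the answer; the variable names are illustrative.

```r
name <- "R U Still Down? [Remember Me]"
pattern <- "[\\(\\)\\*\\$\\+\\?'\\[\\]]"

# Default engine: the bracket expression appears to close at the first "]",
# leaving a trailing literal "]" that must also match, so nothing is replaced.
default_result <- gsub(pattern, "", name)

# PCRE: "\\]" is an escaped literal "]" inside the class.
perl_result <- gsub(pattern, "", name, perl = TRUE)

print(default_result)  # "R U Still Down? [Remember Me]"
print(perl_result)     # "R U Still Down Remember Me"
```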
Answer 2:
You can avoid escaping altogether by reordering the character class: a "]" placed first inside the brackets is treated as a literal character.
name <- 'R U Still Down? [Remember Me][*[[]*'
gsub('[][?*]', '', name)
# [1] "R U Still Down Remember Me"
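The same ordering trick extends to the longer character set from the question. A minimal sketch, assuming the rule that "]" is literal when it comes first in the class; as far as I know PCRE follows the same rule, so the pattern should behave identically with perl=TRUE. The variable names are illustrative.

```r
name <- "R U Still Down? [Remember Me]"

# "]" comes first, so it is literal; no other character needs a backslash.
pattern <- "[][()*$+?']"

cleaned_default <- gsub(pattern, "", name)
cleaned_perl <- gsub(pattern, "", name, perl = TRUE)

print(cleaned_default)  # "R U Still Down Remember Me"
print(identical(cleaned_default, cleaned_perl))
```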
If you want to remove all punctuation characters, use the POSIX class [:punct:] inside a bracket expression:
gsub('[[:punct:]]', '', name)
Within the ASCII range, this class matches every character that is not a control character, an alphanumeric character, or a space:
ascii <- rawToChar(as.raw(0:127), multiple=T)
paste(ascii[grepl('[[:punct:]]', ascii)], collapse="")
# [1] "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
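Because [:punct:] covers so many characters, it can remove more than intended. A small sketch comparing it with a targeted class, using a made-up title (not from the question) that contains an apostrophe:

```r
# Hypothetical title, chosen only to illustrate the difference.
title <- "Don't Stop (Live)"

# [:punct:] removes the apostrophe along with the parentheses.
all_punct <- gsub("[[:punct:]]", "", title)

# A targeted class removes only the listed characters.
targeted <- gsub("[][()?*]", "", title)

print(all_punct)  # "Dont Stop Live"
print(targeted)   # "Don't Stop Live"
```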
Source: https://stackoverflow.com/questions/32041265/how-to-escape-closed-bracket-in-regex-in-r