Question
I want to remove words in which any letter or special character occurs more than twice in a row.
For example, the input is
"Google in theee lland of whhhat c#, c++ and e###"
and the output should be
"Google in lland of c#, c++ and"
Answer 1:
x <- "Google in theee lland of whhhat c#, c++ and e###"
gsub("\\S*(\\S)\\1\\1\\S*\\s?", "", x)
# [1] "Google in lland of c#, c++ and "
(\\S)\\1\\1 finds three consecutive occurrences of the same non-space character. The surrounding \\S* and \\S*\\s? match the preceding and following characters within the same word, plus a single whitespace character immediately after the word, if there is one.
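Note that the result above keeps a trailing space when the last word is removed, because the pattern only consumes whitespace after a word, not before it. As a rough sketch of one way around that (not part of the original answer), the string can be split into words, filtered with the same core pattern, and joined again. The function name drop_repeated_words is made up for illustration, and perl = TRUE is an optional choice here rather than a requirement.

```r
# Hypothetical helper: drop every word that contains the same
# non-space character three or more times in a row.
drop_repeated_words <- function(x) {
  # Split on runs of whitespace to get the individual words.
  words <- strsplit(x, "\\s+", perl = TRUE)[[1]]
  # TRUE for words without a triple repeat such as "eee" or "###".
  keep <- !grepl("(\\S)\\1\\1", words, perl = TRUE)
  # Re-join with single spaces, so no trailing space is left behind.
  paste(words[keep], collapse = " ")
}

x <- "Google in theee lland of whhhat c#, c++ and e###"
drop_repeated_words(x)
# [1] "Google in lland of c#, c++ and"
```

A shorter alternative is to wrap the original call in trimws(), which strips leading and trailing whitespace from the result. Unlike the split-and-join version, it leaves the spacing between the remaining words exactly as gsub produced it.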
Source: https://stackoverflow.com/questions/22888528/regex-to-remove-words-if-it-contains-a-letter-special-character-multiple-times-s