Regex to remove words that contain a letter/special character repeated more than twice in a row in R

别来无恙 submitted on 2019-12-07 09:45:55

Question


I want to remove every word in which the same letter or special character occurs more than twice in a row.

For example, given the input

"Google in theee lland of whhhat c#, c++ and e###"

and the output should be

"Google in lland of c#, c++ and"

Answer 1:


x <- "Google in theee lland of whhhat c#, c++ and e###"
gsub("\\S*(\\S)\\1\\1\\S*\\s?", "", x)
# [1] "Google in lland of c#, c++ and "

(\\S)\\1\\1 matches three consecutive occurrences of the same non-space character: the group captures one character, and each \\1 requires that same character again.
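
To see the core pattern on its own, here is a minimal sketch that tests each word of the example separately with grepl. The vector name words and the result name has_triple are just illustrative choices, not part of the original answer.

```r
# Words from the example input, split by hand for illustration
words <- c("Google", "theee", "lland", "whhhat", "c#,", "c++", "e###")

# TRUE where a word contains the same non-space character three times in a row
has_triple <- grepl("(\\S)\\1\\1", words)

has_triple
# [1] FALSE  TRUE FALSE  TRUE FALSE FALSE  TRUE
```

Only "theee", "whhhat" and "e###" are flagged; "Google", "lland" and "c++" repeat a character just twice, so they are kept.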

The surrounding \\S* and \\S*\\s? match the rest of the same word before and after the repeated run, plus at most one whitespace character immediately following the word, so the whole word is removed along with its trailing space.
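
Because the pattern only consumes whitespace after a removed word, a trailing space is left behind when the last word is removed, as in the output above. One possible way to get exactly the string asked for in the question is to trim the result. The sketch below assumes R 3.2.0 or later for trimws; the function name remove_repeated_words is hypothetical and not from the original answer.

```r
# Hypothetical helper: drop words containing a character repeated three
# or more times in a row, then trim leftover leading/trailing whitespace
remove_repeated_words <- function(text) {
  cleaned <- gsub("\\S*(\\S)\\1\\1\\S*\\s?", "", text)
  trimws(cleaned)
}

x <- "Google in theee lland of whhhat c#, c++ and e###"
result <- remove_repeated_words(x)
result
# [1] "Google in lland of c#, c++ and"
```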



Source: https://stackoverflow.com/questions/22888528/regex-to-remove-words-if-it-contains-a-letter-special-character-multiple-times-s
