R code: replacing words containing @

守給你的承諾、 submitted on 2019-12-31 02:15:27

Question


I want to replace all words containing the symbol @ with a specific word. I am using gsub and applying it to a character vector. The issue that keeps occurring is that when I use:

gsub(".*@.*", "email", data) 

all of the text in that element of the character vector is replaced, not just the word containing @.

There are multiple different emails all with different lengths so I can't set the characters prior and characters after to a specific number.

Any suggestions?

I've done my fair share of reading about regex but everything I tried failed.

Here's an example:

data <- c("This is an example. Here is my email: emailaddress@help.com. Thank you")

data <- gsub(".*@.*", "email", data)

it returns [1] "email"

when I want [1] "This is an example. Here is my email: email. Thank you"


Answer 1:


You can use the following:

gsub('\\S+@\\S+', 'email', data)

Explanation:

\S matches any non-whitespace character. So here we match one or more non-whitespace characters, followed by @, followed by one or more non-whitespace characters. Because whitespace can never be part of the match, it stays within a single word.
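To make the difference concrete, here is a small sketch using the example string from the question. The variable names (whole, word, kept) and the third pattern are assumptions added for illustration and are not part of the original answer. Note that \\S+ also consumes punctuation attached to the address, so the period after help.com disappears; the third call is one possible way to keep it, by requiring the match to end on a character that is neither whitespace nor punctuation.

```r
data <- c("This is an example. Here is my email: emailaddress@help.com. Thank you")

# .* is greedy and also matches spaces, so the pattern covers the entire string
whole <- gsub(".*@.*", "email", data)
whole
# [1] "email"

# \\S+ cannot cross whitespace, so only the word around @ is replaced;
# the trailing period is non-whitespace and is consumed as well
word <- gsub("\\S+@\\S+", "email", data)
word
# [1] "This is an example. Here is my email: email Thank you"

# Illustrative variant (an assumption, not from the answer): end the match on a
# character that is neither whitespace nor punctuation, leaving the period in place
kept <- gsub("\\S+@\\S*[^[:space:][:punct:]]", "email", data, perl = TRUE)
kept
# [1] "This is an example. Here is my email: email. Thank you"
```

The variant would also leave other trailing punctuation (commas, closing parentheses) outside the match, which may or may not be what you want for your data.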




Answer 2:


To replace strings with an embedded "@" in R, you can use (translating @Fabricator's pattern to R)

data <- c("This is an example. Here is my email: emailaddress@help.com")
# gsub() returns a new vector, so assign the result back to data
data <- gsub("[^\\s]*@[^\\s]*", "email", data, perl = TRUE)
data
# [1] "This is an example. Here is my email: email"
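One detail worth checking against your own data: with * on both sides, the pattern also matches an @ that stands alone, whereas + requires at least one character on each side. The sketch below uses a made-up string (x) and made-up result names (star, plus) purely for illustration.

```r
# Hypothetical input with a standalone @ and a real address
x <- "meet @ noon, mail a@b.c"

# * allows zero characters on either side, so the lone @ is replaced too
star <- gsub("[^\\s]*@[^\\s]*", "email", x, perl = TRUE)
star
# [1] "meet email noon, mail email"

# + requires at least one non-whitespace character before and after @
plus <- gsub("[^\\s]+@[^\\s]+", "email", x, perl = TRUE)
plus
# [1] "meet @ noon, mail email"
```

Both calls keep perl = TRUE, as in the answer, so that \\s inside the brackets is treated as the whitespace class.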


Source: https://stackoverflow.com/questions/24395382/r-code-removing-words-containing
