Remove extra white space from between letters in R using gsub()

╄→尐↘猪︶ㄣ submitted on 2019-12-01 04:58:40

Question


There are a slew of answers out there on how to remove extra whitespace from between words, which is super simple. However, I'm finding that removing extra whitespace within words is much harder. As a reproducible example, let's say I have a vector of data that looks like this:

x <- c("L L C", "P O BOX 123456", "NEW YORK")

What I'd like to do is something like this:

y <- gsub("(\\w)(\\s)(\\w)(\\s)", "\\1\\3", x)

But that leaves me with this:

[1] "LLC" "POBOX 123456" "NEW YORK"

Almost perfect, but I'd really like to have that second value say "PO BOX 123456". Is there a better way to do this than what I'm doing?


Answer 1:


You can try this:

> x <- c("L L C", "P O BOX 123456", "NEW YORK")
> gsub("(?<=\\b\\w)\\s(?=\\w\\b)", "", x, perl = TRUE)
[1] "LLC"           "PO BOX 123456" "NEW YORK" 

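It removes a whitespace character only when it sits between two single-character words. The lookbehind (?<=\\b\\w) requires a one-character word immediately before the space, and the lookahead (?=\\w\\b) requires one immediately after it. Because lookarounds do not consume characters, consecutive spaces in "L L C" can each match, while the space between "O" and "BOX" is left alone.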
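
To make the behaviour easier to check, here is a minimal sketch that wraps the same pattern in a function. The name collapse_spaced_letters and the letters_only argument are hypothetical and not part of the original answer. Using \\s+ instead of \\s (to also handle runs of several spaces) and restricting the match to ASCII letters are assumptions about what one might want. The extra input "APT 4 B" shows that \\w also matches digits.

```r
# Hypothetical helper (not from the original answer): joins runs of
# single-character words by deleting the whitespace between them.
collapse_spaced_letters <- function(x, letters_only = FALSE) {
  # \\w matches letters, digits and underscore; [A-Za-z] is ASCII letters only.
  ch <- if (letters_only) "[A-Za-z]" else "\\w"
  # Lookbehind: a one-character word before the gap.
  # Lookahead: a one-character word after the gap.
  pattern <- sprintf("(?<=\\b%s)\\s+(?=%s\\b)", ch, ch)
  gsub(pattern, "", x, perl = TRUE)
}

x <- c("L L C", "P O BOX 123456", "NEW YORK", "APT 4 B")

collapse_spaced_letters(x)
# [1] "LLC"           "PO BOX 123456" "NEW YORK"      "APT 4B"

collapse_spaced_letters(x, letters_only = TRUE)
# [1] "LLC"           "PO BOX 123456" "NEW YORK"      "APT 4 B"
```

With the default \\w, the lone digit "4" counts as a single-character word, so it is merged with "B". Setting letters_only = TRUE may be the safer choice for address-like data.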



Source: https://stackoverflow.com/questions/31280327/remove-extra-white-space-from-between-letters-in-r-using-gsub
