Removing multiple commas and trailing commas using gsub

Submitted by 折月煮酒 on 2019-12-04 14:44:25

Question


This question is very similar to Removing multiple spaces and trailing spaces using gsub, except that I'd like to apply it to commas instead of spaces.

For example, I'd like a function TrimCommas to turn x into y:

x <- c("a,b,c", ",a,b,,c", ",,,a,,,b,c,,,")
# y <- TrimCommas(x) # presumably
y <- c("a,b,c", "a,b,c", "a,b,c")

The solution for spaces was gsub("^ *|(?<= ) | *$", "", x, perl=T), so I'm hoping that comparing it with the solution for commas will help explain some regex fundamentals as well.


Answer 1:


Isn't the solution pretty similar?

x <- c("a,b,c", ",a,b,,c", ",,,a,,,b,c,,,")
gsub("^,*|(?<=,),|,*$", "", x, perl=TRUE)
# [1] "a,b,c" "a,b,c" "a,b,c"

There are three parts to the regex ^,*|(?<=,),|,*$:

  • ^,* -- this matches 0 or more commas at the beginning of the string
  • (?<=,), -- this uses a positive lookbehind to check whether a comma is preceded by another comma, so it matches the second , in ,, (the first comma of each run is kept)
  • ,*$ -- this matches 0 or more commas at the end of the string

Every match of any of these three parts is replaced with the empty string.
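
To see what each alternative contributes, the three parts can be run separately on the same input. This is only an illustrative sketch: the variable names lead, inner and trail are made up for the example and are not part of the original answer, and sub is used for the two anchored parts because each can match only once at its end of the string.

```r
x <- c("a,b,c", ",a,b,,c", ",,,a,,,b,c,,,")

# Part 1: strip any run of commas at the start of the string.
lead <- sub("^,*", "", x, perl = TRUE)

# Part 2: drop every comma that directly follows another comma,
# which collapses each run of commas to a single comma.
inner <- gsub("(?<=,),", "", x, perl = TRUE)

# Part 3: strip any run of commas at the end of the string.
trail <- sub(",*$", "", x, perl = TRUE)

print(lead)   # "a,b,c"  "a,b,,c"  "a,,,b,c,,,"
print(inner)  # "a,b,c"  ",a,b,c"  ",a,b,c,"
print(trail)  # "a,b,c"  ",a,b,,c" ",,,a,,,b,c"
```

None of the parts is enough on its own: the lookbehind leaves one comma per run, including at both ends, so the anchored parts are still needed to remove the leading and trailing commas entirely.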

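You can generalize this to other characters (" ", ",", etc.) with the function below. Note that char is pasted into the pattern unescaped, so a regex metacharacter such as "." will not be treated literally: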

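# Collapse runs of `char` to a single occurrence and strip them from both ends.
# `char` is inserted into the regex as-is (no escaping).
TrimMult <- function(x, char=" ") {
  return(gsub(paste0("^", char, "*|(?<=", char, ")", char, "|", char, "*$"),
              "", x, perl=TRUE))
}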
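
A short usage sketch follows, together with one possible way to handle characters that have a special meaning in a regex. TrimMultLiteral is a hypothetical helper, not part of the original answer; it assumes PCRE's \Q...\E quoting (available with perl=TRUE) and will not work if char itself contains the sequence \E.

```r
# The function from the answer, repeated so this example runs on its own.
TrimMult <- function(x, char = " ") {
  gsub(paste0("^", char, "*|(?<=", char, ")", char, "|", char, "*$"),
       "", x, perl = TRUE)
}

x <- c("a,b,c", ",a,b,,c", ",,,a,,,b,c,,,")
TrimMult(x, ",")            # "a,b,c" "a,b,c" "a,b,c"
TrimMult("  a  b   c  ")    # "a b c" (default char is a space)

# With a metacharacter the pattern becomes "^.*|(?<=.).|.*$",
# and "^.*" swallows the whole string.
TrimMult("..a..b.c..", ".") # ""

# Hypothetical variant: quote `char` so it is always matched literally.
TrimMultLiteral <- function(x, char = " ") {
  lit <- paste0("(?:\\Q", char, "\\E)")
  pattern <- paste0("^", lit, "*|(?<=", lit, ")", lit, "|", lit, "*$")
  gsub(pattern, "", x, perl = TRUE)
}

TrimMultLiteral("..a..b.c..", ".")  # "a.b.c"
```

For plain characters such as a comma or a space, both versions behave the same; the quoting only matters when char is something like ".", "|" or "*".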


Source: https://stackoverflow.com/questions/23274035/removing-multiple-commas-and-trailing-commas-using-gsub
