Remove all characters after the 3rd occurrence of “-” in each element of a vector

痞子三分冷 submitted on 2019-12-06 11:41:58

Question


I am not that good with regular expressions in R. I would like to remove all characters after the 3rd occurrence of "-" in each element of a vector.

 Initial string        Expected result
 aa-bbb-cccc    =>    aa-bbb
 aa-vvv-vv      =>    aa-vvv
 aa-ddd         =>    aa-ddd

Any help?


Answer 1:


Judging by the sample input and expected output, I assume you need to remove everything from the 2nd hyphen onward (not the 3rd), including that hyphen.

You may use

sub("^([^-]*-[^-]*).*", "\\1", x)

Details:

  • ^ - start of string
  • ([^-]*-[^-]*) - Group 1, capturing 0+ chars other than -, then a literal -, then 0+ chars other than -
  • .* - any 0+ chars (in a TRE regex like this, a dot also matches line break chars)

The \\1 (\1) is a backreference to the text captured into Group 1.
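The same pattern can be generalized to cut at the n-th hyphen, which also covers the "3rd occurrence" reading of the title. The following is only a minimal sketch: the helper name truncate_at_nth_hyphen and its parameter n are hypothetical, not part of the original answer. It switches to perl = TRUE so that a non-capturing group can be used, and adds (?s) so the dot keeps matching line breaks as it does in the TRE version.

```r
# Hypothetical helper (not from the original answer): keep everything before
# the n-th hyphen and drop the rest. Strings with fewer than n hyphens do not
# match the pattern and are returned unchanged.
truncate_at_nth_hyphen <- function(x, n = 2L) {
  stopifnot(n >= 1)
  # Group 1: a hyphen-free run, followed by (n - 1) repetitions of
  # "hyphen + hyphen-free run". (?s) lets .* match line breaks under PCRE.
  pattern <- sprintf("(?s)^([^-]*(?:-[^-]*){%d}).*", n - 1L)
  sub(pattern, "\\1", x, perl = TRUE)
}

x <- c("aa-bbb-cccc", "aa-vvv-vv", "aa-ddd")
truncate_at_nth_hyphen(x, 2)
## => [1] "aa-bbb" "aa-vvv" "aa-ddd"
truncate_at_nth_hyphen("a-b-c-d", 3)
## => [1] "a-b-c"
```

With n = 2 the generated pattern is equivalent to the one shown above.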

R demo:

x <- c("aa-bbb-cccc", "aa-vvv-vv", "aa-ddd")
sub("^([^-]*-[^-]*).*", "\\1", x)
## => [1] "aa-bbb" "aa-vvv" "aa-ddd"
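
As a cross-check, the same result can be reproduced without a regular expression by splitting on the hyphen and rejoining the first two pieces. This is an alternative sketch, not the answer's method, and the variable names parts and kept are made up for illustration. Note one assumption: strsplit drops a trailing empty piece, so an input such as "aa-" would come back as "aa" here, whereas the regex keeps it as "aa-".

```r
x <- c("aa-bbb-cccc", "aa-vvv-vv", "aa-ddd")

# Split each element on a literal hyphen (fixed = TRUE disables regex parsing).
parts <- strsplit(x, "-", fixed = TRUE)

# Keep at most the first two pieces of each element and glue them back together.
kept <- vapply(parts, function(p) paste(head(p, 2), collapse = "-"), character(1))
kept
## => [1] "aa-bbb" "aa-vvv" "aa-ddd"

# Compare against the regex approach from the answer.
identical(kept, sub("^([^-]*-[^-]*).*", "\\1", x))
## => [1] TRUE
```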


Source: https://stackoverflow.com/questions/41622326/remove-all-characters-after-the-3rd-occurrence-of-in-each-element-of-a-vecto
