Question
I am not that good with regular expressions in R. I would like to remove all characters after the 3rd occurrence of "-" in each element of a vector.
Initial string
aa-bbb-cccc => aa-bbb
aa-vvv-vv => aa-vvv
aa-ddd => aa-ddd
Any help?
Answer 1:
Judging by the sample input and expected output, I assume you need to remove everything from the 2nd hyphen onward.
You may use
sub("^([^-]*-[^-]*).*", "\\1", x)
Details:
^ - start of string
([^-]*-[^-]*) - Group 1 capturing 0+ chars other than -, a - and 0+ chars other than -
.* - any 0+ chars (in a TRE regex like this, a dot matches line break chars, too)
The \\1 (\1) is a backreference to the text captured into Group 1.
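To see what Group 1 holds before any substitution happens, the match can be inspected with base R's regexec() and regmatches(). This is only an illustrative sketch; the variable names (pattern, m, group1) are assumptions and do not come from the original answer.

```r
# Inspect the full match and Group 1 for each input string.
x <- c("aa-bbb-cccc", "aa-ddd")
pattern <- "^([^-]*-[^-]*).*"

# regexec() returns match positions for the whole match and each group;
# regmatches() turns them into the matched substrings.
m <- regmatches(x, regexec(pattern, x))

# Each list element holds the full match first, then Group 1.
group1 <- vapply(m, function(parts) parts[2], character(1))
print(group1)
```

Whatever Group 1 holds is exactly what sub() writes back when the replacement is "\\1".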
R demo:
x <- c("aa-bbb-cccc", "aa-vvv-vv", "aa-ddd")
sub("^([^-]*-[^-]*).*", "\\1", x)
## => [1] "aa-bbb" "aa-vvv" "aa-ddd"
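The question text mentions the 3rd occurrence while the examples cut at the 2nd, so a version parameterised by the hyphen number may be handy. The following is a minimal sketch, assuming a hypothetical helper named trim_from_nth_hyphen (not part of the original answer or of base R). It builds the same kind of pattern with strrep(), which requires R 3.3.0 or later.

```r
# Hypothetical helper: drop everything from the n-th hyphen onward.
# Strings with fewer than n hyphens are returned unchanged.
trim_from_nth_hyphen <- function(x, n = 2) {
  stopifnot(n >= 1)
  # Keep n - 1 hyphens, each followed by a run of non-hyphen chars.
  kept <- strrep("-[^-]*", n - 1)
  pattern <- paste0("^([^-]*", kept, ").*")
  sub(pattern, "\\1", x)
}

x <- c("aa-bbb-cccc", "aa-vvv-vv", "aa-ddd")
trim_from_nth_hyphen(x, 2)
## => [1] "aa-bbb" "aa-vvv" "aa-ddd"
trim_from_nth_hyphen("a-b-c-d", 3)
## => [1] "a-b-c"
```

With n = 2 the generated pattern is identical to the one used above; repeating the "-[^-]*" piece as a literal avoids relying on quantifier behaviour inside the group.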
Source: https://stackoverflow.com/questions/41622326/remove-all-characters-after-the-3rd-occurrence-of-in-each-element-of-a-vecto