Remove characters preceding first instance of a capital letter in string in R

落花浮王杯 submitted on 2019-12-01 17:27:41

Question


I'm trying to remove all characters preceding the first instance of a capital letter for each string in a vector of strings:

x <- c(" its client Auto Group",  "itself and Phone Company", ", client Large Bank")

I've tried:

sub('.*?[A-Z]', '', x) 

But that returns:

"uto Group"  "hone Company"   "arge Bank"

I need it to return:

"Auto Group"    "Phone Company" "Large Bank"

Any ideas?

Thanks.


Answer 1:


You need to use a capturing group with a backreference:

sub("^.*?([A-Z])", "\\1", x)

Here,

  • ^ - start of the string
  • .*? - zero or more of any character, as few as possible (a lazy match)
  • ([A-Z]) - Capture group 1 capturing an uppercase ASCII letter that will be referenced with \1 in the replacement pattern.

So the match consumes the first capital letter along with everything before it, and the backreference restores that captured letter in the result.
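
As a quick check, here is a minimal sketch that runs the pattern on the vector from the question. The lookahead variant with perl = TRUE is not part of the original answer; it is shown only as one possible alternative, and the variable names (res_capture, res_lookahead, no_caps) are made up for illustration.

```r
x <- c(" its client Auto Group", "itself and Phone Company", ", client Large Bank")

# Capture the first uppercase ASCII letter and restore it with \\1
res_capture <- sub("^.*?([A-Z])", "\\1", x)
print(res_capture)
# [1] "Auto Group"    "Phone Company" "Large Bank"

# Alternative (assumption: PCRE enabled via perl = TRUE): a lookahead
# asserts that a capital letter follows without consuming it, so the
# replacement can be an empty string
res_lookahead <- sub("^.*?(?=[A-Z])", "", x, perl = TRUE)
print(res_lookahead)

# A string without any uppercase letter does not match and is left unchanged
no_caps <- sub("^.*?([A-Z])", "\\1", "no capitals here")
print(no_caps)
```

Note that [A-Z] only covers ASCII capitals; for accented or non-Latin uppercase letters, a class such as [[:upper:]] may be a better fit, depending on the locale.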



Source: https://stackoverflow.com/questions/37842146/remove-characters-preceding-first-instance-of-a-capital-letter-in-string-in-r
