Remove everything before the last space

痞子三分冷 submitted on 2019-12-01 12:30:30

Your gsub("\\s*","\\1",str) code replaces each occurrence of 0 or more whitespace characters with the value of capturing group #1, which is an empty string because the pattern defines no capturing group. In effect, it removes all whitespace from the string.
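To make that behaviour concrete, here is a minimal sketch using a hypothetical sample string (the value of str is an assumption, chosen to match the demo input used elsewhere in this answer). It compares the original call with an explicit empty replacement:

```r
# Hypothetical sample input
str <- "Veni vidi vici"

# The original call: \\s* matches runs of whitespace (and empty strings),
# and \\1 refers to a capturing group that the pattern never defines
stripped <- gsub("\\s*", "\\1", str)

# Removing whitespace with an explicit empty replacement
stripped_explicit <- gsub("\\s+", "", str)

print(stripped)
print(stripped_explicit)
```

Neither call keeps only the last word, which is why a different pattern is needed.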

You want to match up to the last whitespace:

sub(".*\\s", "", str)

If you do not want to get a blank result in case your string has trailing whitespace, trim the string first:

sub(".*\\s", "", trimws(str))
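The difference between the two calls can be sketched as follows. The vector x is a hypothetical input, with one clean string and one string that ends in whitespace:

```r
# Hypothetical inputs: one clean string, one with trailing whitespace
x <- c("Veni vidi vici", "Veni vidi vici  ")

# Without trimming, the last whitespace is the trailing one,
# so the whole second string is matched and removed
without_trim <- sub(".*\\s", "", x)

# Trimming first makes the last whitespace the one before the last word
with_trim <- sub(".*\\s", "", trimws(x))

print(without_trim)
print(with_trim)
```

A string that contains no whitespace at all is returned unchanged by both calls, since the pattern finds no match.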

Or use the handy stri_extract_last_regex function from the stringi package with a simple \S+ pattern (matching 1 or more non-whitespace chars):

library(stringi)
stri_extract_last_regex(str, "\\S+")
# => [1] "vici"
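If installing stringi is not an option, the same idea can be approximated in base R. This is only a sketch: last_word is a hypothetical helper name, not a function from stringi or base R, and it is assumed that returning NA for strings without any non-whitespace chars is acceptable.

```r
# Hypothetical helper, not part of stringi or base R
last_word <- function(x) {
  # Extract every run of non-whitespace chars from each string
  tokens <- regmatches(x, gregexpr("\\S+", x))
  # Keep the last run, or NA when the string has none
  vapply(
    tokens,
    function(w) if (length(w) > 0) w[length(w)] else NA_character_,
    character(1)
  )
}

result <- last_word(c("Veni vidi vici", "vici  ", "   "))
print(result)
```

Unlike the sub approach, this one is not affected by trailing whitespace, so no trimws call is needed.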

Note that .* matches any 0 or more chars, as many as possible (since * is a greedy quantifier and . in a TRE pattern matches any char, including line break chars), so it grabs the whole string at first. Then backtracking starts, since the regex engine still needs to match a whitespace char with \s. Giving back one character at a time from the end of the string, the regex engine stops at the last whitespace and returns the match, which sub then replaces with an empty string.
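The effect of greediness can be seen by comparing .* with its lazy counterpart .*? on a hypothetical input. In this sketch perl = TRUE is an assumption made only so that the lazy quantifier behaves in a well-documented way; the greedy result is the same as with the default engine for this input.

```r
# Hypothetical sample input
str <- "Veni vidi vici"

# What each pattern actually matches
greedy_match <- regmatches(str, regexpr(".*\\s", str, perl = TRUE))
lazy_match <- regmatches(str, regexpr(".*?\\s", str, perl = TRUE))

# What is left after removing the match
greedy_result <- sub(".*\\s", "", str, perl = TRUE)
lazy_result <- sub(".*?\\s", "", str, perl = TRUE)

print(greedy_match)   # everything up to and including the last space
print(lazy_match)     # everything up to and including the first space
print(greedy_result)
print(lazy_result)
```

The greedy pattern leaves only the last word, while the lazy one removes just the first word.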

See the R demo and a regex demo online:

str <- c("Veni vidi vici")
sub(".*\\s", "", str)
## => [1] "vici"

Also, you may want to see how backtracking works in the regex debugger, where red arrows mark the backtracking steps.
