Remove all text between two brackets

礼貌的吻别 2020-11-27 21:23

Suppose I have some text like this,

text <- c("[McCain]: We need tax policies that respect the wage earners and job creators. [Obama]: It's harder to save


        
5 Answers
  •  囚心锁ツ
    2020-11-27 21:31

    You do not need a PCRE regex for this. A negated bracket expression (character class) also works with R's default TRE regex engine:

    subject <- "Some [string] here and [there]"
    gsub("\\[[^][]*]", "", subject)
    ## => [1] "Some  here and "
    


    Details:

    • \\[ - a literal [ (must be escaped or used inside a bracket expression like [[] to be parsed as a literal [)
    • [^][]* - a negated bracket expression that matches zero or more characters other than [ and ] (note that a ] placed first in the bracket expression, right after the ^, is treated as a literal ])
    • ] - a literal ] (outside a bracket expression this character is not special in either PCRE or TRE regexps, so it does not have to be escaped).
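As a minimal sketch, the same pattern can be applied to speaker-labelled text like the one in the question. The `debate` string below is a made-up stand-in (the question's own `text` vector is cut off), and the second pattern, which also swallows the colon and spaces after each label, is an assumption about the desired output rather than part of the original answer.

```r
# Hypothetical sample input; the question's real vector is truncated.
debate <- "[McCain]: We need tax policies. [Obama]: We need to invest."

# Remove every [...] group with the default TRE engine.
no_labels <- gsub("\\[[^][]*]", "", debate)
print(no_labels)
## => [1] ": We need tax policies. : We need to invest."

# The identical pattern is also valid under PCRE (perl = TRUE).
no_labels_pcre <- gsub("\\[[^][]*]", "", debate, perl = TRUE)
print(identical(no_labels, no_labels_pcre))
## => [1] TRUE

# Assumed variant: also drop an optional ":" and any spaces after the label.
text_only <- gsub("\\[[^][]*]:? *", "", debate)
print(text_only)
## => [1] "We need tax policies. We need to invest."
```

Whether the leftover ": " should go depends on what the cleaned text is for, so the third call is only one possible choice.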

    If you want to only replace the square brackets with some other delimiters, use a capturing group with a backreference in the replacement pattern:

    gsub("\\[([^][]*)\\]", "{\\1}", subject)
    ## => [1] "Some {string} here and {there}"
    


    The (...) parenthetical construct forms a capturing group, and its contents can be accessed in the replacement with the backreference \1, which is written "\\1" in an R string literal (the group is the first one in the pattern, so its ID is 1).
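To make the backreference a little more concrete, here is a small sketch that reuses the `subject` string from the answer. The `<...>` delimiters and the variable names `angled` and `doubled` are arbitrary choices for illustration.

```r
subject <- "Some [string] here and [there]"

# "\\1" in R source is the two-character string \1 seen by the regex engine.
print(nchar("\\1"))
## => [1] 2

# Swap the square brackets for angle brackets, keeping the captured text.
angled <- gsub("\\[([^][]*)\\]", "<\\1>", subject)
print(angled)
## => [1] "Some <string> here and <there>"

# A backreference can be used more than once in the replacement.
doubled <- gsub("\\[([^][]*)\\]", "\\1 (\\1)", subject)
print(doubled)
## => [1] "Some string (string) here and there (there)"
```

Each match gets its own value for \1, so the replacement is filled in separately for `[string]` and `[there]`.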
