Python regex - Replace bracketed text with contents of brackets

Submitted by 冷眼眸甩不掉的悲伤 on 2021-02-07 17:54:08

Question


I'm trying to write a Python function that replaces instances of text surrounded with curly braces with the contents of the braces, while leaving empty brace-pairs alone. For example:

foo {} bar {baz} would become foo {} bar baz.

The pattern that I've created to match this is {[^{}]+}, i.e. some text that doesn't contain curly braces (to prevent overlapping matches) surrounded by a set of curly braces.

The obvious solution is to use re.sub with my pattern, and I've found that I can reference the matched text with \g<0>:

>>> re.sub(r"{[^{}]+}", r"A \g<0> B", "foo {} bar {baz}")
'foo {} bar A {baz} B'

So that's no problem. However, I'm stuck on how to trim the braces from the referenced text. If I try applying a slice to the replacement string:

>>> re.sub(r"{[^{}]+}", r"\g<0>"[1:-1], "foo{}bar{baz}")
'foo{}barg<0'

The slice is applied before the \g<0> is resolved to the matched text, so it trims the leading \ and trailing >, leaving just g<0, which has no special meaning.

I also tried defining a function to perform the trimming:

def trimBraces(string):
    return string[1:-1]

But, unsurprisingly, that didn't change anything:

>>> re.sub(r"{[^{}]+}", trimBraces(r"\g<0>"), "foo{}bar{baz}")
'foo{}barg<0'

What am I missing here? Many thanks in advance.


Answer 1:


You can use a capturing group around the text inside the braces and refer to it as \1 in the replacement, so that only that part of the match is kept:

>>> re.sub(r"{([^{}]+)}", r"\1", "foo{}bar{baz}")
'foo{}barbaz'
>>> re.sub(r"{([^{}]+)}", r"\1", "foo {} bar {baz}")
'foo {} bar baz'
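
The same substitution can be wrapped in a small helper with a precompiled pattern. This is only a sketch: the names BRACED and unwrap_braces are made up for illustration, and the braces are escaped here purely for readability (the unescaped pattern above matches the same strings).

```python
import re

# Hypothetical names, not part of the original answer.
BRACED = re.compile(r"\{([^{}]+)\}")

def unwrap_braces(text):
    """Replace each non-empty {...} pair with its contents; leave {} alone."""
    return BRACED.sub(r"\1", text)

print(unwrap_braces("foo {} bar {baz}"))  # foo {} bar baz
print(unwrap_braces("{a}{}{b c}"))        # a{}b c
```

Because [^{}]+ requires at least one character, an empty {} never matches and is left untouched.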



Answer 2:


When you use "\g<0>"[1:-1] as a replacement pattern, you only slice the "\g<0>" string, not the actual value this backreference refers to.
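
To see why, it may help to evaluate the replacement argument by itself, since Python computes the slice before re.sub is ever called. A small sketch (the variable name replacement is only illustrative):

```python
import re

# The slice runs first, on the literal five-character string \g<0>.
replacement = r"\g<0>"[1:-1]
print(replacement)  # g<0

# re.sub therefore receives plain text with no backreference in it.
print(re.sub(r"{[^{}]+}", replacement, "foo{}bar{baz}"))  # foo{}barg<0
```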

If you want to keep your "trimming" approach, pass a function as the replacement argument. re.sub calls it with the match object for each match and uses the returned string as the replacement:

re.sub("{[^{}]+}", lambda m: m.group()[1:-1], "foo{}bar{baz}")
# => foo{}barbaz

Note that m.group() corresponds to \g<0> in your replacement string, i.e. the whole match value, braces included.
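
The trimBraces idea from the question can be kept if the function accepts the match object instead of a string. A sketch under that assumption; the name trim_braces_match is hypothetical:

```python
import re

def trim_braces_match(match):
    # re.sub calls this once per match; group(0) is the whole match, braces included.
    return match.group(0)[1:-1]

# Pass the function itself, without calling it.
print(re.sub(r"{[^{}]+}", trim_braces_match, "foo{}bar{baz}"))  # foo{}barbaz
```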

However, using a capturing group is the more "organic" solution; see alexce's answer above.



Source: https://stackoverflow.com/questions/38734335/python-regex-replace-bracketed-text-with-contents-of-brackets
