Why do I have to pass a callable to re.sub to make an uppercase string?

Submitted by 非 Y 不嫁゛ on 2021-02-19 03:34:46

Question


Below is a somewhat contrived example that gets at my eventual question...

I would like to use a regex to uppercase the first character of a possessive noun. Let's say that I have a regular expression that (probably poorly) matches possessive nouns, i.e.:

### Regex explanation ###
# 1. match a word break
# 2. match a word-like character and capture in a group
# 3. lookahead for some more characters + the substring "'s"

>>> import re
>>> my_re = re.compile(r"\b(\w)(?=\w+'s)")
>>> re.search(my_re, "bruce's computer")
<_sre.SRE_Match object; span=(0, 1), match='b'>

>>> re.search(my_re, "bruce's computer").group(1)
'b'

For this example, it works as expected. So I think all I have to do is call upper on the first group in sub and it should work, right?

>>> re.sub(my_re, r'\1'.upper(), "bruce's computer")
"bruce's computer"

This is not what I expected, and it is not obvious why the result is not capitalized. After some research I found this in the re.sub documentation...

Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in string by the replacement repl. repl can be either a string or a callable; if a string, backslash escapes in it are processed. If it is a callable, it's passed the match object and must return a replacement string to be used.

Indeed passing a callable does work...

>>> re.sub(my_re, lambda x: x.group(1).upper(), "bruce's computer")
"Bruce's computer"

Great. What I would like to understand is why this works; otherwise I won't remember how to use the API correctly in this kind of case without looking it up. Any direction would be appreciated.


Answer 1:


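The second argument can be a string or a callable.

re.sub(my_re, r'\1'.upper(), "bruce's computer"): the argument is evaluated before re.sub is called, and r'\1'.upper() is still r'\1' (a backslash and a digit have no uppercase form). So you're passing the plain \1 string to the sub function, which inserts the matched group unchanged.

re.sub(my_re, lambda x: x.group(1).upper(), "bruce's computer"): you're passing a callable, which re.sub calls with the match object for each match, so upper() works because it is applied to the matched text.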

x.group(1).upper() isn't evaluated immediately because it's contained in a lambda expression, which is equivalent to the named function:

def func(x):
    return x.group(1).upper()

that you could also pass to re.sub: re.sub(my_re, func, "bruce's computer"). Note the lack of () after func in that case: you pass the function itself, not the result of calling it.
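
To make the difference concrete, here is a minimal sketch that runs both forms side by side. The function name upper_first and the second sample string are illustrative only and do not come from the original post.

```python
import re

my_re = re.compile(r"\b(\w)(?=\w+'s)")
text = "bruce's computer"

# .upper() runs before re.sub is ever called. A backslash and a digit have
# no uppercase form, so the replacement is still the template r'\1'.
assert r'\1'.upper() == r'\1'
print(my_re.sub(r'\1'.upper(), text))   # bruce's computer

# A callable is invoked once per match with the match object, so .upper()
# is applied to the captured text rather than to the template.
def upper_first(match):  # hypothetical helper name
    return match.group(1).upper()

print(my_re.sub(upper_first, text))     # Bruce's computer
print(my_re.sub(upper_first, "bruce's and clark's computers"))
# Bruce's and Clark's computers
```

A rule of thumb that may help with remembering the API: a string replacement is a fixed template filled in by re.sub, so any Python method called on it acts on the template text, while a callable is the way to run arbitrary Python on each match.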



Source: https://stackoverflow.com/questions/41622102/why-do-i-have-to-pass-a-callable-to-re-sub-to-make-an-uppercase-string
