Question
Below is a somewhat contrived example that gets at my eventual question...
I would like to use a regex to uppercase the first character of a possessive noun. Let's say that I have a regular expression that (probably poorly) matches possessive nouns, i.e.:
### Regex explanation ###
# 1. match a word break
# 2. match a word-like character and capture in a group
# 3. lookahead for some more characters + the substring "'s"
>>> import re
>>> my_re = re.compile(r"\b(\w)(?=\w+'s)")
>>> re.search(my_re, "bruce's computer")
<_sre.SRE_Match object; span=(0, 1), match='b'>
>>> re.search(my_re, "bruce's computer").group(1)
'b'
For this example, it works as expected. So I think that all I have to do is call upper on the first group in sub and it should work, right?
>>> re.sub(my_re, r'\1'.upper(), "bruce's computer")
"bruce's computer"
This is unexpected, and it is not obvious why the first letter is not capitalized. After some research, I found this in the re.sub documentation:
Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in string by the replacement repl. repl can be either a string or a callable; if a string, backslash escapes in it are processed. If it is a callable, it's passed the match object and must return a replacement string to be used.
Indeed passing a callable does work...
>>> re.sub(my_re, lambda x: x.group(1).upper(), "bruce's computer")
"Bruce's computer"
Great. What I would like to understand is why this works; otherwise I won't remember how to use the API correctly in this kind of situation without looking it up. Any direction would be appreciated.
Answer 1:
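The second argument to re.sub can be either a string or a callable.

re.sub(my_re, r'\1'.upper(), "bruce's computer")

Here you are passing a string. Python evaluates r'\1'.upper() before re.sub is called, and uppercasing a backslash and a digit leaves the string unchanged, so re.sub simply receives the template \1 (upper or not, it doesn't matter) and puts the captured group back as-is.

re.sub(my_re, lambda x: x.group(1).upper(), "bruce's computer")

Here you are passing a callable, so upper() works because it is applied to the matched text. x.group(1).upper() isn't evaluated immediately because it is wrapped in a lambda expression; re.sub calls it for each match. The lambda is equivalent to this named function:

def func(x):
    return x.group(1).upper()

which you could also pass to re.sub:

re.sub(my_re, func, "bruce's computer")

Note the lack of () after func in that case: you pass the function itself, not the result of calling it.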
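To make the order of evaluation concrete, here is a minimal sketch that reuses the pattern from the question. The names template, upper_first, unchanged and changed are made up for illustration. It checks that the string argument is already fully evaluated (and identical to the plain template) before re.sub sees it, while the callable only runs once a match exists.

```python
import re

my_re = re.compile(r"\b(\w)(?=\w+'s)")

# .upper() runs here, before any substitution happens. A backslash and
# a digit have no uppercase form, so the template is left unchanged.
template = r'\1'.upper()
assert template == r'\1'

# re.sub receives the plain template and reinserts the group as-is.
unchanged = my_re.sub(template, "bruce's computer")

def upper_first(match):
    # Called by re.sub for each match, with the match object as argument.
    return match.group(1).upper()

# Pass the function itself (no parentheses), not the result of calling it.
changed = my_re.sub(upper_first, "bruce's computer")

print(unchanged)
print(changed)
```

The first call leaves the input as it was and the second capitalizes the first letter, which matches the behaviour shown in the question.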
Source: https://stackoverflow.com/questions/41622102/why-do-i-have-to-pass-a-callable-to-re-sub-to-make-an-uppercase-string