Is there any benefit in using compile for regular expressions in Python?
h = re.compile('hello')
h.match('hello world')
vs
re.match('hello', 'hello world')
Performance difference aside, using re.compile and then calling match (or any other regular-expression operation) on the compiled object makes the intent clearer to the Python run-time.
I had a painful experience debugging some simple code:
compare = lambda s, p: re.match(p, s)
and later I'd use compare in
[x for x in data if compare(patternPhrases, x[columnIndex])]
where patternPhrases is supposed to be a variable containing a regular expression string and x[columnIndex] is a variable containing a string.
The trouble was that patternPhrases silently failed to match strings I expected it to match!
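To make the silent failure concrete, here is a minimal sketch. The sample data, the column index, and the pattern are made up for illustration; only compare and the list comprehension come from the code above. Because the lambda declares its parameters as (s, p) while the call passes (pattern, string), re.match ends up treating the data string as the pattern, and no error is raised.

```python
import re

# Hypothetical sample data; only the variable names come from the post.
data = [("hello world",), ("goodbye",)]
columnIndex = 0
patternPhrases = r"hel+o"  # a plain regular expression string

# Parameters are declared as (s, p), but the call below passes (pattern, string).
compare = lambda s, p: re.match(p, s)

# Effectively runs re.match("hello world", "hel+o"), which finds nothing.
matches = [x for x in data if compare(patternPhrases, x[columnIndex])]
print(matches)  # prints []
```

Both arguments are strings, so re.match accepts them in either order and the mix-up goes unnoticed.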
But if I used the re.compile form:
compare = lambda s, p: p.match(s)
then in
[x for x in data if compare(patternPhrases, x[columnIndex])]
Python would have complained with "AttributeError: 'str' object has no attribute 'match'", because the positional arguments are swapped in compare and x[columnIndex] ends up being used as the regular expression, when I actually meant
compare = lambda p, s: p.match(s)
In my case, using re.compile makes the purpose of the regular expression more explicit when its value is not visible to the naked eye, so I get more help from Python's run-time checking.
So the moral of my lesson is that when the regular expression is not just a literal string, I should use re.compile to let Python help me assert my assumption.
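A small sketch of that difference, again with made-up sample data and a made-up pattern (swapped is a hypothetical name for the misordered lambda): once the pattern is a compiled object, the swapped version fails loudly instead of quietly returning no matches.

```python
import re

# Hypothetical sample data; only the variable names come from the post.
data = [("hello world",), ("goodbye",)]
columnIndex = 0
patternPhrases = re.compile(r"hel+o")  # compiled pattern object, not a string

# Same argument mix-up as before: declared (s, p), called with (pattern, string).
swapped = lambda s, p: p.match(s)
try:
    [x for x in data if swapped(patternPhrases, x[columnIndex])]
    error = None
except AttributeError as exc:
    # The data string has no .match method, so the mistake surfaces immediately.
    error = exc
print(error)

# Corrected parameter order.
compare = lambda p, s: p.match(s)
matches = [x for x in data if compare(patternPhrases, x[columnIndex])]
print(matches)  # prints [('hello world',)]
```

The compiled object acts as a cheap type check here: a str cannot stand in for it, so a swapped argument cannot slip through.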