Is there any benefit in using compile for regular expressions in Python?
h = re.compile(\'hello\')
h.match(\'hello world\')
vs
Using the given examples:
h = re.compile('hello')
h.match('hello world')
The match method in the example above is not the same as the one used below:
re.match('hello', 'hello world')
re.compile() returns a regular expression object, which means h is a regex object.
The regex object has its own match method with the optional pos and endpos parameters:
regex.match(string[, pos[, endpos]])
pos
The optional second parameter pos gives an index in the string where the search is to start; it defaults to 0. This is not completely equivalent to slicing the string; the
'^'pattern character matches at the real beginning of the string and at positions just after a newline, but not necessarily at the index where the search is to start.
endpos
The optional parameter endpos limits how far the string will be searched; it will be as if the string is endpos characters long, so only the characters from pos to
endpos - 1will be searched for a match. If endpos is less than pos, no match will be found; otherwise, if rx is a compiled regular expression object,rx.search(string, 0, 50)is equivalent torx.search(string[:50], 0).
The regex object's search, findall, and finditer methods also support these parameters.
re.match(pattern, string, flags=0) does not support them as you can see,
nor does its search, findall, and finditer counterparts.
A match object has attributes that complement these parameters:
match.pos
The value of pos which was passed to the search() or match() method of a regex object. This is the index into the string at which the RE engine started looking for a match.
match.endpos
The value of endpos which was passed to the search() or match() method of a regex object. This is the index into the string beyond which the RE engine will not go.
A regex object has two unique, possibly useful, attributes:
regex.groups
The number of capturing groups in the pattern.
regex.groupindex
A dictionary mapping any symbolic group names defined by (?P) to group numbers. The dictionary is empty if no symbolic groups were used in the pattern.
And finally, a match object has this attribute:
match.re
The regular expression object whose match() or search() method produced this match instance.