Is it worth using Python's re.compile?

旧时难觅i 2020-11-22 12:51

Is there any benefit in using compile for regular expressions in Python?

h = re.compile('hello')
h.match('hello world')

vs

re.match('hello', 'hello world')

26 answers
  •  被撕碎了的回忆
    2020-11-22 13:05

    Using the given examples:

    h = re.compile('hello')
    h.match('hello world')
    

    The match method in the example above is not the same as the one used below:

    re.match('hello', 'hello world')
    

    re.compile() returns a regular expression object, which means h is a regex object.

    The regex object has its own match method with the optional pos and endpos parameters:

    regex.match(string[, pos[, endpos]])

    pos

    The optional second parameter pos gives an index in the string where the search is to start; it defaults to 0. This is not completely equivalent to slicing the string; the '^' pattern character matches at the real beginning of the string and at positions just after a newline, but not necessarily at the index where the search is to start.

    endpos

    The optional parameter endpos limits how far the string will be searched; it will be as if the string is endpos characters long, so only the characters from pos to endpos - 1 will be searched for a match. If endpos is less than pos, no match will be found; otherwise, if rx is a compiled regular expression object, rx.search(string, 0, 50) is equivalent to rx.search(string[:50], 0).
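A minimal sketch of the pos and endpos behaviour described above. The sample strings and variable names are made up for illustration; only the documented `match` and `search` methods of a compiled pattern are used.

```python
import re

rx = re.compile(r'^world')
text = 'hello world'

# pos=6 starts the match attempt at index 6, but '^' still refers to the
# real beginning of the string, so the attempt fails.
m_pos = rx.match(text, 6)

# Slicing creates a new string whose real beginning is 'w', so '^' matches.
m_slice = rx.match(text[6:])

digits = re.compile(r'\d+')
s = 'abc 12345 xyz'

# endpos=7 makes the engine behave as if the string were s[:7] == 'abc 123'.
m_end = digits.search(s, 0, 7)

# endpos smaller than pos: no match can be found.
m_none = digits.search(s, 7, 4)

print(m_pos, m_slice.group(), m_end.group(), m_none)
```

This is why passing pos is not a drop-in replacement for slicing whenever the pattern contains anchors.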

    The regex object's search, findall, and finditer methods also support these parameters.

    The module-level re.match(pattern, string, flags=0) does not accept them, as its signature shows,
    and neither do the module-level search, findall, and finditer functions.
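To illustrate the difference, here is a small sketch (the sample text is an assumption) that restricts the searched region with a compiled pattern and then reproduces the same result with the module-level function by slicing by hand.

```python
import re

rx = re.compile(r'\d+')
text = '10 20 30 40'

# Compiled pattern: pos and endpos restrict the region that is scanned.
tail = rx.findall(text, 3)         # scan from index 3 to the end
middle = rx.findall(text, 3, 8)    # scan indices 3..7 only

# finditer accepts the same parameters; spans stay relative to the
# original string, not to the restricted region.
spans = [m.span() for m in rx.finditer(text, 6)]

# Module-level function: the third positional argument is flags, not pos,
# so the string has to be sliced manually.
sliced = re.findall(r'\d+', text[3:8])

print(tail, middle, spans, sliced)
```

Note that the sliced variant loses the original offsets, which matters if you need the match positions afterwards.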

    A match object has attributes that complement these parameters:

    match.pos

    The value of pos which was passed to the search() or match() method of a regex object. This is the index into the string at which the RE engine started looking for a match.

    match.endpos

    The value of endpos which was passed to the search() or match() method of a regex object. This is the index into the string beyond which the RE engine will not go.
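A short sketch of how these two attributes can be read back from a match. The string and the chosen bounds are only examples.

```python
import re

rx = re.compile(r'o')
text = 'hello world'

# Search only within indices 5..8; the first 'o' in that region is at index 7.
m = rx.search(text, 5, 9)

# Without explicit bounds, pos is 0 and endpos is reported as the string length.
default = rx.search(text)

print(m.pos, m.endpos, m.start())
print(default.pos, default.endpos, default.start())
```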


    A regex object has two unique, possibly useful, attributes:

    regex.groups

    The number of capturing groups in the pattern.

    regex.groupindex

    A dictionary mapping any symbolic group names defined by (?P<id>) to group numbers. The dictionary is empty if no symbolic groups were used in the pattern.
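As an illustration, the following sketch inspects both attributes on a hypothetical date pattern with two named groups and one unnamed group.

```python
import re

# Two named groups (year, month) and one unnamed group (day).
date_rx = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(\d{2})')

group_count = date_rx.groups              # counts named and unnamed groups
name_to_number = dict(date_rx.groupindex) # only the named groups appear here

# A pattern without any groups.
plain_rx = re.compile(r'hello')
plain_count = plain_rx.groups
plain_names = dict(plain_rx.groupindex)

print(group_count, name_to_number, plain_count, plain_names)
```

These values are available before any matching is done, which can be handy for validating a user-supplied pattern.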


    And finally, a match object has this attribute:

    match.re

    The regular expression object whose match() or search() method produced this match instance.
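A brief sketch of match.re, reusing the pattern from the question. The second string passed to `search` is an assumption for illustration.

```python
import re

h = re.compile('hello')
m = h.match('hello world')

# The match keeps a reference to the pattern object that produced it.
same_object = m.re is h

# That reference can be used to run the same pattern again.
again = m.re.search('say hello')

# A match produced by the module-level function also exposes its pattern.
m2 = re.match('hello', 'hello world')
pattern_text = m2.re.pattern

print(same_object, again.span(), pattern_text)
```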
