Is it worth using Python's re.compile?

旧时难觅i 2020-11-22 12:51

Is there any benefit in using compile for regular expressions in Python?

h = re.compile('hello')
h.match('hello world')

vs

re.match('hello', 'hello world')

26 answers
  •  南旧 (OP)
    2020-11-22 13:03

    FWIW:

    $ python -m timeit -s "import re" "re.match('hello', 'hello world')"
    100000 loops, best of 3: 3.82 usec per loop
    
    $ python -m timeit -s "import re; h=re.compile('hello')" "h.match('hello world')"
    1000000 loops, best of 3: 1.26 usec per loop
    

    In this run the precompiled pattern is about three times faster per call, so if you're going to use the same regex many times, re.compile may be worth it (especially for more complex regexes).

    The standard arguments against premature optimization apply, but I don't think you really lose much clarity/straightforwardness by using re.compile if you suspect that your regexps may become a performance bottleneck.
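For what it's worth, the "compile once, reuse many times" idea might look something like the sketch below. The names HELLO and count_greetings and the sample data are made up for illustration; only re.compile and the pattern's match method come from the re module.

```python
import re

# Hypothetical module-level pattern: compiled once at import time and reused.
HELLO = re.compile(r"hello")


def count_greetings(lines):
    """Count the lines that start with 'hello', reusing the precompiled pattern."""
    # Pattern.match only matches at the beginning of each string.
    return sum(1 for line in lines if HELLO.match(line))


if __name__ == "__main__":
    sample = ["hello world", "goodbye", "hello again"]
    print(count_greetings(sample))  # prints 2
```

Keeping the pattern in a named constant also gives the regex a readable name, which helps clarity regardless of any speed difference.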

    Update:

    Under Python 3.6 (I suspect the above timings were done using Python 2.x) and 2018 hardware (MacBook Pro), I now get the following timings:

    % python -m timeit -s "import re" "re.match('hello', 'hello world')"
    1000000 loops, best of 3: 0.661 usec per loop
    
    % python -m timeit -s "import re; h=re.compile('hello')" "h.match('hello world')"
    1000000 loops, best of 3: 0.285 usec per loop
    
    % python -m timeit -s "import re" "h=re.compile('hello'); h.match('hello world')"
    1000000 loops, best of 3: 0.65 usec per loop
    
    % python --version
    Python 3.6.5 :: Anaconda, Inc.
    

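    I also added a third case (note where the quotation marks fall in the last two runs: in the last one, re.compile runs inside the timed statement rather than in the setup). It shows that re.match(x, ...) costs roughly the same as re.compile(x).match(...). This does not mean the pattern is recompiled on every call: the re module caches recently compiled patterns, so the extra time in both cases is most likely the per-call cache lookup and function-call overhead, which a precompiled pattern object skips.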
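If you would rather reproduce the three runs from a script than from the shell, a minimal sketch using the standard-library timeit module could look like this. PATTERN, TEXT, and time_variants are names I made up; the lambdas add a little call overhead to every variant, and absolute numbers will differ by machine and Python version.

```python
import re
import timeit

# Same pattern and subject string as in the command-line runs above.
PATTERN = "hello"
TEXT = "hello world"


def time_variants(number=100_000):
    """Return the average seconds per call for each of the three variants."""
    compiled = re.compile(PATTERN)  # compiled once, outside the timed code
    variants = {
        "re.match": lambda: re.match(PATTERN, TEXT),
        "precompiled": lambda: compiled.match(TEXT),
        "compile each call": lambda: re.compile(PATTERN).match(TEXT),
    }
    return {
        name: timeit.timeit(fn, number=number) / number
        for name, fn in variants.items()
    }


if __name__ == "__main__":
    for name, seconds in time_variants().items():
        print(f"{name:>18}: {seconds * 1e6:.3f} usec per call")
```

Only the relative ordering of the three numbers is meaningful here, so compare them to each other rather than to the figures quoted above.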
