How to compile multiple regexes in one go? Is it more efficient? - python

心不动则不痛 submitted on 2019-12-25 06:13:59

Question


Let's say I have code like this:

import re
docid_re = re.compile(r'<DOCID>([^>]+)</DOCID>')
doctype_re = re.compile(r'<DOCTYPE SOURCE="[^"]+">([^>]+)</DOCTYPE>')
datetime_re = re.compile(r'<DATETIME>([^>]+)</DATETIME>')

I could also do this:

>>> import re
>>> docid_re = r'<DOCID>([^>]+)</DOCID>'
>>> doctype_re = r'<DOCTYPE SOURCE="[^"]+">([^>]+)</DOCTYPE>'
>>> datetime_re = r'<DATETIME>([^>]+)</DATETIME>'
>>> docid_re, doctype_re, datetime_re = map(re.compile, [docid_re, doctype_re, datetime_re])
>>> docid_re
<_sre.SRE_Pattern object at 0x7f0314eee438>

But is there any real gain in speed or memory when I use map()?


Answer 1:


Don't take anybody's word for it, just measure it! You can use the timeit module for that. But remember that "premature optimization is the root of all evil" (Donald Knuth).

Btw, the answer to your question is: no, it doesn't help at all.
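
A minimal sketch of such a measurement with timeit, using the three patterns from the question. The function names (compile_separately, compile_with_map, measure) are made up for illustration, and the numbers you get will depend on your machine and Python version. Note that re caches recently compiled patterns, so repeated calls mostly measure lookup and call overhead rather than actual compilation.

```python
import re
import timeit

PATTERNS = [
    r'<DOCID>([^>]+)</DOCID>',
    r'<DOCTYPE SOURCE="[^"]+">([^>]+)</DOCTYPE>',
    r'<DATETIME>([^>]+)</DATETIME>',
]


def compile_separately():
    # One explicit re.compile call per pattern, as in the first snippet.
    docid_re = re.compile(PATTERNS[0])
    doctype_re = re.compile(PATTERNS[1])
    datetime_re = re.compile(PATTERNS[2])
    return docid_re, doctype_re, datetime_re


def compile_with_map():
    # The map() variant from the second snippet.
    return tuple(map(re.compile, PATTERNS))


def measure(func, number=100_000):
    # Total time in seconds for `number` calls of func.
    return timeit.timeit(func, number=number)


if __name__ == "__main__":
    for func in (compile_separately, compile_with_map):
        print(func.__name__, measure(func))
```

Whichever variant comes out ahead, compare the difference with how long the rest of your program takes before deciding it matters.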




Answer 2:


If you were compiling a lot of regexes, map might help by avoiding the lookup costs of finding re and then getting its compile attribute on each call; with map, you look up map once and re.compile once, and then it gets used over and over without further lookups. Of course, when you need to construct a list to use it, you eat into those savings. Practically speaking, you'd need an awful lot of regexes to reach the point where map would be worth your while; for three, it's probably a loss.
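
To make the lookup argument concrete, here is a rough sketch of where the name lookups happen in each form. The variable names are only for illustration, and all three forms produce equivalent compiled patterns.

```python
import re

patterns = [
    r'<DOCID>([^>]+)</DOCID>',
    r'<DOCTYPE SOURCE="[^"]+">([^>]+)</DOCTYPE>',
    r'<DATETIME>([^>]+)</DATETIME>',
]

# Every iteration looks up the global name `re`, then its `compile` attribute.
compiled_loop = [re.compile(p) for p in patterns]

# `re.compile` is resolved once, when the map() call is set up.
compiled_map = list(map(re.compile, patterns))

# The same saving without map: bind the function to a plain name first.
compile_pattern = re.compile
compiled_bound = [compile_pattern(p) for p in patterns]
```

The third form gets the "look it up once" benefit while keeping an ordinary list comprehension, if that reads better to you than map.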

Even when it did help, it would be the tiniest of microoptimizations. I would do it if it made the code cleaner; performance is a tertiary concern here at best. There are cases (say, parsing a huge text file of integers into ints) where map can be a big win because the overhead of starting it up is compensated for by the reduced lookup and Python byte code execution overhead. But this is not one of those cases, and those cases are so rare as to not be worth worrying about 99.99% of the time.
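
For comparison, a sketch of the integer-parsing case, assuming the input is whitespace-separated integers (built in memory here instead of read from a file). The function names are hypothetical; time both on your own data rather than trusting any particular ratio.

```python
import timeit

# Stand-in for the contents of a large text file of integers.
text = " ".join(str(n) for n in range(100_000))
tokens = text.split()


def parse_with_map():
    # int is looked up once; the loop itself runs inside map().
    return list(map(int, tokens))


def parse_with_comprehension():
    # int is looked up and called from Python byte code on every iteration.
    return [int(tok) for tok in tokens]


if __name__ == "__main__":
    for func in (parse_with_map, parse_with_comprehension):
        print(func.__name__, timeit.timeit(func, number=20))
```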



Source: https://stackoverflow.com/questions/32873378/how-to-compile-multiple-multiple-regexes-in-one-go-is-it-more-efficient-pyth
