Python string 'in' operator implementation algorithm and time complexity

余生长醉 提交于 2019-11-26 03:43:39

问题


I am thinking of how the in operator implement, for instance

>>> s1 = \'abcdef\'
>>> s2 = \'bcd\'
>>> s2 in s1
True

In CPython, which algorithm is used to implement the string match, and what is the time complexity? Is there any official document or wiki about this?


回答1:


It's a combination of Boyer-Moore and Horspool.

You can view the C code here:

Fast search/count implementation, based on a mix between Boyer-Moore and Horspool, with a few more bells and whistles on the top. For some more background, see: http://effbot.org/zone/stringlib.htm.

From the link above:

When designing the new algorithm, I used the following constraints:

  • should be faster than the current brute-force algorithm for all test cases (based on real-life code), including Jim Hugunin’s worst-case test
  • small setup overhead; no dynamic allocation in the fast path (O(m) for speed, O(1) for storage)
  • sublinear search behaviour in good cases (O(n/m))
  • no worse than the current algorithm in worst case (O(nm))
  • should work well for both 8-bit strings and 16-bit or 32-bit Unicode strings (no O(σ) dependencies)
  • many real-life searches should be good, very few should be worst case reasonably simple implementation


来源:https://stackoverflow.com/questions/18139660/python-string-in-operator-implementation-algorithm-and-time-complexity

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!