What is the theory behind KMP pattern matching algorithm? [closed]

隐身守侯 提交于 2019-12-02 17:38:34

The KMP matching algorithm is based on finite automata and works by implicitly building the transition table for an automaton that matches the string. Using a very clever linear-time preprocessing of the string to search for, a matching automaton can be constructed, and the automaton can then be simulated on the string to search in in linear time. The net result is a linear-time algorithm for string matching.

The automaton that's constructed works by having one state for each amount of the string that has been matched so far. By default, the transitions are such that matching the next character advances to the next state in the machine and reading an invalid character transitions back to the beginning. However, certain pieces of the string to search for might share some overlapping structure, so some new transitions are added that take the automaton back to an earlier (but not the first) state when a character is read.

This algorithm is generalized by the Aho-Corasick algorithm, which builds a more complex automaton in order to search for multiple strings at once.

I have an implementation of this algorithm on my personal site that contains an extensive discussion of the actual details of how the algorithm works, including a correctness proof, more formal intuition behind the algorithm, and explanation of how to derive the algorithm from first principles. It took me a while to put together, but I hope that it might be a good resource to learn more about the algorithm.

Hope this helps!

Morris discovered the algorithm from first principles, but Knuth independently learned about a theorem due to Stephen A. Cook that deterministic two-way pushdown automata could be simulated in linear-time and extracted an early version of "KMP" by specializing the simulation to a string matching automaton. Pratt contributed an efficiency improvement. See Knuth's retelling.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!