Why does Regexp have a timeout method, when in theory it shouldn't need one?

Posted by 大兔子大兔子 on 2020-01-01 19:51:11

Question


This is a theoretical Computer Science question (Computation Theory).

I know that RegExps can take a very long time to evaluate. However, from the Theory of Computation we know that matching against a Regular Expression can be done very fast: a DFA reads each input character exactly once, so matching takes time linear in the length of the input.

If RegExps are equivalent to Finite Automata, why do RegExps have (or require) a timeout method? Using a DFA, the computation time for matching can be extremely fast.
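To make the DFA claim concrete, here is a minimal sketch of a hand-built DFA for the regular language a+ (one or more "a" characters). The function name, the state numbering and the step counter are illustrative assumptions, not part of any regex library. The point is only that the number of steps equals the length of the input, whatever the input is.

```python
def dfa_accepts(text):
    """Simulate a hand-built DFA for the regular language a+.

    States: 0 = start, 1 = seen at least one 'a' (accepting), 2 = dead.
    Returns (accepted, steps), where steps is the number of characters read.
    """
    transitions = {(0, "a"): 1, (1, "a"): 1}
    state = 0
    steps = 0
    for ch in text:
        steps += 1
        # Any transition that is not listed leads to the dead state 2.
        state = transitions.get((state, ch), 2)
    return state == 1, steps


print(dfa_accepts("aaa"))           # (True, 3)
print(dfa_accepts("a" * 10 + "b"))  # (False, 11)
```

Each character is consumed exactly once and there is nothing to undo, so no timeout is needed for this kind of matcher.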

By RegExps I mean the Regular Expression pattern matching classes in major languages: JavaScript, C#, etc.

Are common RegExps ("regex"s) not equivalent to the Regular Expressions in Theory of Automata (i.e. Regular Languages)?

For examples, see "How do I timeout Regex operations to prevent hanging in .NET 4.5?" and "Regex Pattern Catastrophic backtracking".

If Regexp matching requires backtracking, it means they are not equivalent to Regular Expressions.

If the languages captured by "Regexp"s are not Regular Languages, historically why (out of which necessity) were they extended?

Is it that the resulting DFA would require a huge set of states?


Answer 1:


A good reason is catastrophic backtracking, which explains why matching of some regexes will not return before the heat death of the universe.
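As a rough illustration of catastrophic backtracking, the sketch below hand-codes a naive backtracking matcher for the single pattern ^(a+)+$ and counts how many alternatives it tries. It is a toy model written for this example, not the algorithm of any particular regex engine, and the name match_nested_plus is hypothetical. On an input of n "a" characters followed by one "b", the match must fail, and the matcher tries every way of splitting the run of "a"s into groups before giving up.

```python
def match_nested_plus(text):
    """Naive backtracking matcher for the fixed pattern ^(a+)+$.

    Returns (matched, steps), where steps counts every choice of group
    length that the matcher tries.
    """
    steps = 0

    def group(pos):
        # Match one (a+) group starting at pos, then either the end of
        # the string or another group.
        nonlocal steps
        end = pos
        while end < len(text) and text[end] == "a":
            end += 1
        # Greedy: try the longest run first, then back off one char at a time.
        for stop in range(end, pos, -1):
            steps += 1
            if stop == len(text):
                return True
            if group(stop):
                return True
        return False

    return group(0), steps


for n in (4, 8, 12, 16):
    matched, steps = match_nested_plus("a" * n + "b")
    print(n, matched, steps)  # steps is 2**n - 1 for this input
```

The step count doubles with every extra "a", while a successful input such as "aaaa" is accepted in a single step. That asymmetry is why a pattern can appear harmless in testing and still hang on one unlucky input, which is the case a timeout guards against.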




Answer 2:


Because regexes are not equivalent to the Regular Expressions of the Theory of Automata.

They are more like cousins with extra features that make them more complex and sometimes (depending on the regex) impractical to execute on long strings.
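The answer does not name the extra features, but backreferences are a commonly cited one. The sketch below uses Python's standard re module, and the helper name matches is an assumption made for this example. The pattern (a+)b\1 accepts exactly the strings made of n "a"s, a "b", and the same n "a"s again. That language is not regular, so no DFA can recognize it.

```python
import re

# \1 must repeat exactly the text captured by group 1, so the pattern
# accepts n 'a's, then 'b', then the same number of 'a's.
pattern = re.compile(r"(a+)b\1")


def matches(text):
    """Return True if the whole string matches the backreference pattern."""
    return pattern.fullmatch(text) is not None


print(matches("aabaa"))   # True
print(matches("aabaaa"))  # False: the two runs of 'a' differ in length
```

Once features like this are supported, an engine cannot rely on a plain DFA for every pattern, which is one reason many implementations fall back on backtracking.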




Answer 3:


(out of which necessity) were they extended?

Regexp implementations were extended in systems in which the lack of a regexp feature requires difficult workarounds, such as writing a substantial amount of code in an inexpressive programming language. There is also the grave risk that the code might turn out to be correct, performant and robust against false positive matches.



Source: https://stackoverflow.com/questions/57731835/why-does-regexp-have-a-timeout-method-while-in-theory-they-shouldnt
