Simple AlphaNumeric Regex (single spacing) without Catastrophic Backtracking

被刻印的时光 ゝ submitted on 2019-11-26 03:45:36

Question


I have the following regex (which works) that allows alphanumeric characters (as well as ' and -) and no double spacing:

  ^([a-zA-Z0-9'-]+\s?)*$

Due to the nested grouping, this allows Catastrophic Backtracking to happen - which is bad!

How can I simplify this expression to avoid catastrophic backtracking? (Ideally, it also wouldn't allow whitespace as the first or last character.)


Answer 1:


Explanation

A nested group doesn't automatically cause catastrophic backtracking. In your case, it happens because your regex degenerates to the classic example of catastrophic backtracking, (a*)*.

Since \s is optional in ^([a-zA-Z0-9'-]+\s?)*$, on input that has no spaces but contains a character outside the allowed list, the regex effectively degenerates to ^([a-zA-Z0-9'-]+)*$.

You can also think in terms of expanding the original regex:

[a-zA-Z0-9'-]+\s?[a-zA-Z0-9'-]+\s?[a-zA-Z0-9'-]+\s?[a-zA-Z0-9'-]+\s?...

Since \s is optional, we can remove it:

[a-zA-Z0-9'-]+[a-zA-Z0-9'-]+[a-zA-Z0-9'-]+[a-zA-Z0-9'-]+...

We end up with a series of consecutive [a-zA-Z0-9'-]+. When the overall match fails, the engine tries every way of distributing the characters among them, and the complexity blows up.
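To get a feel for the size of that search space, here is a minimal Python sketch that counts the ways a run of n allowed characters can be cut into consecutive non-empty pieces. The function name count_splits is made up for illustration, and the count is only a rough proxy for the paths a backtracking engine may explore before reporting failure, not a measurement of any particular engine.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_splits(n: int) -> int:
    """Count the ways to cut a run of n characters into consecutive non-empty pieces."""
    if n == 0:
        return 1  # nothing left to distribute: exactly one way (stop here)
    # The first piece takes k characters (1..n); the remainder is split recursively.
    return sum(count_splits(n - k) for k in range(1, n + 1))


for n in (5, 10, 20, 30):
    print(n, count_splits(n))
```

The count doubles with every extra character (it equals 2^(n-1) for n >= 1), which is why even a short input ending in a disallowed character can stall the original regex.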

Solution

The standard way to write a regex that matches token delimiter token ... delimiter token is token (delimiter token)*. While it is possible to rewrite the regex to avoid repeating token, I'd recommend against it, since it is harder to get right. To avoid the repetition, you might want to construct the regex by string concatenation instead.

Following the recipe above:

^[a-zA-Z0-9'-]+(\s[a-zA-Z0-9'-]+)*$

Although you can see repetition in repetition here, there is no catastrophic backtracking, since the regex can only expand to:

[a-zA-Z0-9'-]+\s[a-zA-Z0-9'-]+\s[a-zA-Z0-9'-]+\s[a-zA-Z0-9'-]+...

And \s and [a-zA-Z0-9'-] are mutually exclusive, so there is only one way to match any string.
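As a sketch of the string-concatenation idea, the pattern could be assembled and used in Python roughly as follows. The names TOKEN, DELIMITER, PATTERN and is_valid are hypothetical, a non-capturing group is swapped in for the capturing one, and fullmatch stands in for the ^ and $ anchors (in Python, $ also matches just before a trailing newline).

```python
import re

# Building blocks: one "word" of allowed characters, and the single separator.
TOKEN = r"[a-zA-Z0-9'-]+"
DELIMITER = r"\s"

# token (delimiter token)* assembled by concatenation, so TOKEN is written once.
PATTERN = re.compile(TOKEN + "(?:" + DELIMITER + TOKEN + ")*")


def is_valid(text: str) -> bool:
    """Return True if text is words separated by exactly one whitespace character."""
    # fullmatch requires the whole string to match, replacing the ^ and $ anchors.
    return PATTERN.fullmatch(text) is not None


print(is_valid("O'Brien-Smith 42"))   # single spaces only
print(is_valid("hello  world"))       # double space
print(is_valid(" hello"))             # leading space
print(is_valid("a" * 5000 + "!"))     # long input ending in a disallowed character
```

Only the first call should report True. Note that \s also matches tabs and newlines; if only a literal space should be accepted, DELIMITER could be changed to " ".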



Source: https://stackoverflow.com/questions/27237579/simple-alphanumeric-regex-single-spacing-without-catastrophic-backtracking
