regular expression very slow on fail

Submitted by ぐ巨炮叔叔 on 2020-01-02 06:53:29

Question


I have a regular expression that should validate whether a string is composed of space-delimited strings. The regular expression works well (OK, it allows a trailing space, but that's not the problem), but it takes too long when the validation fails.

The regular expression is the following:

/^(([\w\-]+)( )?){0,}$/

When trying to validate with the string

"'this-is_SAMPLE-scope-123,this-is_SAMPLE-scope-456'"

it takes 2 seconds.

The tests were performed in Ruby 1.9.2-rc1 and 1.8.7, but this is probably a general problem.

Any idea?


Answer 1:


Your pattern causes catastrophic backtracking. The catastrophic part can be reduced to this:

(.+)*

The + and the * interact in catastrophic ways in some engines. In your pattern, [\w\-]+ plays the role of .+ and the outer {0,} plays the role of *.
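To see why the nesting hurts: the space in the original pattern is optional, so a run of n word characters with no spaces can be divided among iterations of the outer group in many different ways (about 2^(n-1)), and a backtracking engine may try all of them before giving up. The following is a minimal sketch that times the original pattern on short failing inputs. The input lengths and the constant name are arbitrary choices, and the timings depend on the Ruby version; newer releases may include optimizations that hide the effect.

```ruby
require 'benchmark'

# The pattern from the question, unchanged.
SLOW = /^(([\w\-]+)( )?){0,}$/

# Failing inputs: n word characters followed by a comma, which the
# pattern cannot match. Lengths are kept small so the demo stays quick.
[8, 12, 16, 20].each do |n|
  input = ('a' * n) + ','
  result = nil
  seconds = Benchmark.realtime { result = (SLOW =~ input) }
  puts format('n=%2d  matched=%-5s  %.4fs', n, !result.nil?, seconds)
end
```

On an engine without mitigation, the time would be expected to grow sharply as n increases, even though every one of these inputs is simply rejected.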

It's unclear what you're trying to match, exactly, but it may be something like this:

^[\w\-]+( [\w\-]+)*$

This matches (as seen on rubular.com):

hello world
99 bottles of beer on the wall
this_works_too

and rejects:

not like this, not like this
hey what the &#@!
too many    spaces
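
The lists above can be checked from Ruby by running the suggested pattern over them. This sketch reuses the examples verbatim; the constant, method, and variable names are illustrative only.

```ruby
# The pattern suggested above: one word, then zero or more " word"
# repetitions. The space is mandatory inside the repeated group, so
# there is only one way to divide a valid input into words.
WORDS = /^[\w\-]+( [\w\-]+)*$/

# True when the whole string is made of single-space-separated words.
def valid?(s)
  !(WORDS =~ s).nil?
end

accepted = [
  'hello world',
  '99 bottles of beer on the wall',
  'this_works_too'
]

rejected = [
  'not like this, not like this',
  'hey what the &#@!',
  'too many    spaces'
]

(accepted + rejected).each { |s| puts "#{s.inspect} => #{valid?(s)}" }

# The comma-separated input from the question is rejected as well.
puts valid?('this-is_SAMPLE-scope-123,this-is_SAMPLE-scope-456')
```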

Another option would be to use possessive quantifiers and/or atomic groupings in parts of the original pattern.
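
As a hedged sketch of the atomic-grouping option, the inner run of word characters in the original pattern can be wrapped in an atomic group, (?>...). Once the group has matched, the engine does not go back into it to try shorter runs, which removes the many alternative splits. The constant name is illustrative, and the pattern is meant to keep the original's behavior, including the trailing space it allows.

```ruby
# Original pattern with the inner run made atomic: (?>...) keeps whatever
# it matched and never gives characters back on backtracking.
ATOMIC = /^(?:(?>[\w\-]+) ?)*$/

failing = 'this-is_SAMPLE-scope-123,this-is_SAMPLE-scope-456'

puts((ATOMIC =~ failing).inspect)          # the comma is not allowed
puts((ATOMIC =~ 'hello world').inspect)    # plain space-delimited words
puts((ATOMIC =~ 'hello world ').inspect)   # trailing space still allowed
```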

References

  • regular-expressions.info/Possessive quantifiers and Atomic grouping

Additional tips

The {0,} repetition is usually written simply as *. You can also use non-capturing groups to improve performance, i.e. (?:pattern).
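
Applied to the original pattern, these two tips give the sketch below. This is only a tidier spelling: it accepts the same strings and still contains the nested quantifiers, so on its own it should not be expected to cure the slow failure. The constant names are illustrative.

```ruby
# The pattern from the question, and the same pattern with {0,} written
# as * and the capturing groups replaced by one non-capturing group.
ORIGINAL = /^(([\w\-]+)( )?){0,}$/
TIDIED   = /^(?:[\w\-]+ ?)*$/

# Both spellings should give the same verdict on every input.
['hello world', 'hello world ', 'a,b', ''].each do |s|
  same = (ORIGINAL =~ s).nil? == (TIDIED =~ s).nil?
  puts "#{s.inspect}: same verdict? #{same}"
end

# The tidied pattern records no capture groups at all.
puts ORIGINAL.match('hello world').captures.length
puts TIDIED.match('hello world').captures.length
```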

References

  • regular-expressions.info/Brackets for Capturing and Repetition with Star and Plus

Related questions

  • Using explicitly numbered repetition instead of question mark, star and plus


Source: https://stackoverflow.com/questions/3212256/regular-expression-very-slow-on-fail
