Can someone explain why Java's regex engine goes into catastrophic backtracking on this regex? Every alternation is mutually exclusive with every other alternation.
For the string:
'pão de açúcar itaucard mastercard platinum SUSTENTABILIDADE])
It seems like this part of the regex would be the problem:
'(?:[^']+|'')+'
It matches the first ', then fails to find the closing ', and so backtracks through every combination of the nested quantifiers.
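As a rough illustration of why that gets expensive: a run of n characters without a quote can be cut into consecutive non-empty pieces in 2^(n-1) ways, and each cut is a different way for the inner and outer + to share the run. The sketch below just counts those cuts. The class and method names are made up for illustration, and an actual engine may prune some of these attempts, so read the number as an upper bound rather than a measurement.

```java
public class SplitCount {
    // Counts the ways a run of n non-quote characters can be divided into
    // consecutive non-empty pieces, i.e. the ways (?:[^']+)+ can distribute
    // the run between the inner and the outer quantifier.
    static long countSplits(int n) {
        long[] ways = new long[n + 1];
        ways[0] = 1; // nothing left to split: exactly one way
        for (int len = 1; len <= n; len++) {
            for (int first = 1; first <= len; first++) {
                // The first piece takes `first` characters, the rest is split recursively.
                ways[len] += ways[len - first];
            }
        }
        return ways[n];
    }

    public static void main(String[] args) {
        // The text after the opening quote in the string from the question.
        String tail = "pão de açúcar itaucard mastercard platinum SUSTENTABILIDADE])";
        System.out.println(tail.length() + " chars -> " + countSplits(tail.length()) + " possible splits");
    }
}
```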
If you allow the regex to backtrack, it will backtrack (when failing). Use atomic groups and/or possessive quantifiers to prevent that.
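A minimal sketch of what that looks like in Java, using only the quoted-string branch from the question. The class, field and method names are made up for illustration, and the second test string is an invented example. The first pattern uses possessive quantifiers, the second wraps the repeated body in an atomic group. In both, once the body has consumed the text it never gives it back, so an unterminated quote fails after one pass.

```java
import java.util.regex.Pattern;

public class PossessiveQuoteDemo {
    // Possessive quantifiers (++): no backtracking into the repeated body.
    static final Pattern POSSESSIVE = Pattern.compile("'(?:[^']++|'')++'");

    // Comparable form with an atomic group (?>...) around the repeated body.
    static final Pattern ATOMIC = Pattern.compile("'(?>(?:[^']+|'')+)'");

    // True if the input starts with a complete single-quoted string.
    static boolean startsWithQuoted(Pattern p, String input) {
        return p.matcher(input).lookingAt();
    }

    public static void main(String[] args) {
        // The string from the question: opening quote, no closing quote.
        String unterminated = "'pão de açúcar itaucard mastercard platinum SUSTENTABILIDADE])";
        // Invented example with a doubled quote and a proper closing quote.
        String terminated = "'it''s fine' and more";

        System.out.println(startsWithQuoted(POSSESSIVE, unterminated));
        System.out.println(startsWithQuoted(ATOMIC, unterminated));
        System.out.println(startsWithQuoted(POSSESSIVE, terminated));
        System.out.println(startsWithQuoted(ATOMIC, terminated));
    }
}
```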
Btw, you do not need most of the escapes in that regex. The only characters you (may) need to escape inside a character class ([]) are ^, - and ] (plus [ in Java, where an unescaped [ inside a class opens a nested class). Usually you can even position them so that they do not need escaping either. Of course, the \ and whatever character you are quoting the string with still need to be (double) escaped.
"^(?:[^\\]\\['\"\\s~:/@#|^&(){}\\\\][^\\]\\[\"\\s~:/@#|^&(){}\\\\]*|\"(?:[^\"]++|\"\")++\"|'(?:[^']++|'')++')"
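A small sketch of those escaping rules in Java, with simplified character classes rather than the full regex above. The class and field names are made up for illustration.

```java
import java.util.regex.Pattern;

public class CharClassEscapes {
    // No escapes needed: ( ) { } | are literal inside a class, '^' is literal
    // when it is not the first character, and '-' is literal when it is last.
    static final Pattern UNESCAPED = Pattern.compile("[(){}|^-]");

    // ']' and '\' are escaped in the regex, and '[' too because Java would
    // otherwise read it as a nested class. Every regex backslash is doubled
    // in the Java string literal, and '"' is escaped for the literal itself.
    static final Pattern ESCAPED = Pattern.compile("[\\[\\]\\\\\"]");

    public static void main(String[] args) {
        System.out.println(UNESCAPED.matcher("^").matches());
        System.out.println(UNESCAPED.matcher("-").matches());
        System.out.println(ESCAPED.matcher("]").matches());
        System.out.println(ESCAPED.matcher("\\").matches());
    }
}
```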