Java regex look-behind group does not have obvious maximum length error

情书的邮戳 2021-01-05 03:56

I know that Java regex does not support varying length look-behinds, and that the following should cause an error:

(?<=(not exceeding|no((\\\\w|\\\\s)*)mor         


        
4 Answers
  •  失恋的感觉
    2021-01-05 04:28

    java regex does not support varying length look-behinds

    That is not entirely true: Java supports limited variable-length lookbehinds. For example, (?<=.{0,1000}) is allowed, as are (?<=ab?)c and (?<=abc|defgh).

    But if there is no limit at all, Java doesn't support it.
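A minimal sketch of the bounded cases, using the three lookbehinds quoted above. The class name, the helper firstMatchStart, the sample inputs, and the trailing characters i and z appended to the second and third patterns are assumptions made only to give the lookbehinds something to match:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundedLookbehindDemo {
    // Hypothetical helper: start index of the first match, or -1 if none.
    static int firstMatchStart(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        // Lookbehind of length 1 or 2: "a" optionally followed by "b".
        System.out.println(firstMatchStart("(?<=ab?)c", "xabc"));
        // Alternation of two fixed-length branches (3 and 5 characters).
        System.out.println(firstMatchStart("(?<=abc|defgh)i", "defghi"));
        // Quantifier with an explicit upper bound.
        System.out.println(firstMatchStart("(?<=.{0,1000})z", "xyz"));
    }
}
```

All three patterns compile because the engine can work out a maximum length for the lookbehind in each case.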

    So what does the Java regex engine consider "not obvious" in a lookbehind subpattern?

    A {m,n} quantifier applied to a non-fixed-length subpattern:

    (?:abc){0,1} is allowed
    
    (?:ab?)?     is allowed
    (?:ab|de)    is allowed
    (?:ab|de)?   is allowed
    
    (?:ab?){0,1}   is not allowed
    (?:ab|de){1}   is not allowed
    (?:ab|de){0,1} is not allowed # in my opinion, it is because of the alternation.
                                  # When an alternation is detected, the analysis
                                  # stops immediately
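The list above can be checked on a given JDK with a small probe like the following. This is a sketch, not part of the original answer: the class name, the helper compilesInLookbehind, and the trailing x after the lookbehind are assumptions. The outcome for the "not allowed" subpatterns may well differ between Java versions, so the program only reports what it sees.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class LookbehindProbe {
    // Hypothetical helper: wraps the subpattern in a lookbehind and
    // reports whether Pattern.compile accepts the result.
    static boolean compilesInLookbehind(String subpattern) {
        try {
            Pattern.compile("(?<=" + subpattern + ")x");
            return true;
        } catch (PatternSyntaxException e) {
            // e.g. "Look-behind group does not have an obvious maximum length"
            return false;
        }
    }

    public static void main(String[] args) {
        String[] subpatterns = {
            "(?:abc){0,1}", "(?:ab?)?", "(?:ab|de)", "(?:ab|de)?",
            "(?:ab?){0,1}", "(?:ab|de){1}", "(?:ab|de){0,1}"
        };
        for (String s : subpatterns) {
            String verdict = compilesInLookbehind(s) ? "allowed" : "not allowed";
            System.out.println(s + " -> " + verdict);
        }
    }
}
```

If a {m,n} form is rejected on your JDK, the list suggests that the equivalent ? form, for example (?:ab?)? instead of (?:ab?){0,1}, is worth trying.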
    

    For the error message to appear in these particular cases, two criteria must be met:

    • a potentially variable-length subpattern (i.e. one that contains a quantifier, an alternation, or a backreference)

    • and a {m,n} type quantifier.

    None of these cases is obvious to the user, since the distinction looks like an arbitrary choice. However, I think the real reason is to limit the time the regex engine spends pre-analysing the pattern.
