Why does the non-greedy quantifier sometimes not work in Oracle regex?

后悔当初 2020-12-06 09:02

IMO, this query should return A=1,B=2,

SELECT regexp_substr('A=1,B=2,C=3,', '.*B=.*?,') AS A_and_B FROM dual;

But it retu

4 Answers
  •  萌比男神i
    2020-12-06 10:01

    Looking at the feedback, I hesitate to jump in, but here I go ;-)

    According to the Oracle docs, the *? and +? operators match a "preceding subexpression". For *? specifically:

    Matches zero or more occurrences of the preceding subexpression (nongreedy). Matches the empty string whenever possible.

    To create a subexpression group, use parentheses ():

    Treats the expression within the parentheses as a unit. The expression can be a string or a complex expression containing operators.

    You can refer to a subexpression in a back reference.

    Wrapping each part in a subexpression lets you mix greedy and non-greedy quantifiers (alternating as many times as you like) in the same regexp, with the expected results. For your example:

    select regexp_substr('A=1,B=2,C=3,', '(.)*B=(.)*?,') from dual;
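As a cross-check of the result the question expects, here is a minimal sketch in Python. Note the assumption: Python's `re` module is a different, Perl-style backtracking engine, so this only shows what "greedy, then non-greedy" means under those semantics. It says nothing about what Oracle itself returns.

```python
import re

text = "A=1,B=2,C=3,"

# Pattern from the question: greedy .* followed by non-greedy .*?
plain = re.search(r".*B=.*?,", text).group(0)

# Pattern from this answer: the same quantifiers applied to (.) groups
grouped = re.search(r"(.)*B=(.)*?,", text).group(0)

print(plain)    # A=1,B=2,
print(grouped)  # A=1,B=2,
```

Under Perl-style rules both forms stop at the first comma after `B=`, which is the `A=1,B=2,` the asker had in mind.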
    

    To make the point a bit clearer (I hope), this example uses greedy and non-greedy quantifiers in the same regexp_substr, with different (correct) results depending on where the ? is placed (it does NOT just apply the rule of the first subexpression it sees). Also note that the subexpression (\w) matches only alphanumerics and underscore, not @.

    -- non-greedy followed by greedy 
    select regexp_substr('1_@_2_a_3_@_4_a', '(\w)*?@(\w)*') from dual;
    

    result: 1_@_2_a_3_

    -- greedy followed by non-greedy
    select regexp_substr('1_@_2_a_3_@_4_a', '(\w)*@(\w)*?') from dual;
    

    result: 1_@
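For comparison, the same two patterns can be run through a Perl-style engine. This is a rough sketch using Python's `re` module, which is an assumption on my part and not Oracle's engine. The variable names are purely illustrative.

```python
import re

text = "1_@_2_a_3_@_4_a"

# Non-greedy followed by greedy: the lazy group stops at the first @,
# then the greedy group takes every word character up to the next @.
lazy_then_greedy = re.search(r"(\w)*?@(\w)*", text).group(0)

# Greedy followed by non-greedy: \w cannot cross an @, so the greedy
# group still stops at the first @, and the lazy group matches nothing.
greedy_then_lazy = re.search(r"(\w)*@(\w)*?", text).group(0)

print(lazy_then_greedy)  # 1_@_2_a_3_
print(greedy_then_lazy)  # 1_@
```

Both outputs agree with the results quoted above, so for these inputs the grouped Oracle patterns behave the way a Perl-style engine would.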
