Greedy vs. Reluctant vs. Possessive Quantifiers

傲寒 2020-11-21 07:19

I found this excellent tutorial on regular expressions and while I intuitively understand what "greedy", "reluctant" and "possessive" quantifiers do, there seems to be

7 answers
  • 2020-11-21 08:09

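    Here is my take using Cell and Index positions (see the diagram here to distinguish a Cell from an Index). A Cell holds one character; an Index is a boundary between Cells, so Cell 0 lies between Index 0 and Index 1.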

    Greedy - Give the greedy quantifier as much as possible, then try to match the rest of the regex. If that fails, backtrack on the greedy quantifier by giving up one character at a time.

    Input String: xfooxxxxxxfoo
    Regex: .*foo

    The above Regex has two parts:
    (i)'.*' and
    (ii)'foo'

    Each step below analyzes both parts. Comments explaining why a part is marked PASS or FAIL are given in parentheses.

    Step 1:
    (i) .* = xfooxxxxxxfoo - PASS ('.*' is a greedy quantifier and will use the entire Input String)
    (ii) foo = No character left to match after index 13 - FAIL
    Match failed.

    Step 2:
    (i) .* = xfooxxxxxxfo - PASS (Backtracking on the greedy quantifier '.*')
    (ii) foo = o - FAIL
    Match failed.

    Step 3:
    (i) .* = xfooxxxxxxf - PASS (Backtracking on the greedy quantifier '.*')
    (ii) foo = oo - FAIL
    Match failed.

    Step 4:
    (i) .* = xfooxxxxxx - PASS (Backtracking on the greedy quantifier '.*')
    (ii) foo = foo - PASS
    Report MATCH

    Result: 1 match(es)
    I found the text "xfooxxxxxxfoo" starting at index 0 and ending at index 13.
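
    The four greedy steps above can be sketched as a small simulation. This is only an illustration, not how java.util.regex is implemented: the class GreedyTrace and the helper greedyMatch are hypothetical names, and the sketch assumes the input has no line terminators (which '.' would not match by default). It hands '.*' the whole remaining input, then gives back one character per attempt until 'foo' fits.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyTrace {
    // Hypothetical helper: simulates ".*" + literal starting at index `start`.
    // Returns {matchStart, matchEnd, attempts}, or null if no split works.
    static int[] greedyMatch(String input, int start, String literal) {
        int attempts = 0;
        // Begin with ".*" holding everything, then shrink it by one cell per attempt.
        for (int len = input.length() - start; len >= 0; len--) {
            attempts++;
            if (input.startsWith(literal, start + len)) {
                return new int[] {start, start + len + literal.length(), attempts};
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        int[] traced = greedyMatch(input, 0, "foo");
        System.out.println("simulated: index " + traced[0] + " to " + traced[1]
                + " after " + traced[2] + " attempts");

        // Cross-check against the real engine.
        Matcher m = Pattern.compile(".*foo").matcher(input);
        if (m.find()) {
            System.out.println("java.util.regex: index " + m.start() + " to " + m.end());
        }
    }
}
```

    Both lines should report index 0 to 13, and the simulation needs four attempts, one per step listed above.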

    Reluctant - Give the reluctant quantifier as little as possible, then try to match the rest of the regex. If that fails, add one character to the reluctant quantifier and retry.

    Input String: xfooxxxxxxfoo
    Regex: .*?foo

    The above regex has two parts:
    (i) '.*?' and
    (ii) 'foo'

    Step 1:
    .*? = '' (blank) - PASS (Match as little as possible to the reluctant quantifier '.*?'. Index 0 having '' is a match.)
    foo = xfo - FAIL (Cell 0,1,2 - i.e. index between 0 and 3)
    Match failed.

    Step 2:
    .*? = x - PASS (Add characters to the reluctant quantifier '.*?'. Cell 0 having 'x' is a match.)
    foo = foo - PASS
    Report MATCH (the search for the next match resumes at index 4)

    Step 3:
    .*? = '' (blank) - PASS (Match as little as possible to the reluctant quantifier '.*?'. Index 4 having '' is a match.)
    foo = xxx - FAIL (Cell 4,5,6 - i.e. index between 4 and 7)
    Match failed.

    Step 4:
    .*? = x - PASS (Add characters to the reluctant quantifier '.*?'. Cell 4.)
    foo = xxx - FAIL (Cell 5,6,7 - i.e. index between 5 and 8)
    Match failed.

    Step 5:
    .*? = xx - PASS (Add characters to the reluctant quantifier '.*?'. Cell 4 thru 5.)
    foo = xxx - FAIL (Cell 6,7,8 - i.e. index between 6 and 9)
    Match failed.

    Step 6:
    .*? = xxx - PASS (Add characters to the reluctant quantifier '.*?'. Cell 4 thru 6.)
    foo = xxx - FAIL (Cell 7,8,9 - i.e. index between 7 and 10)
    Match failed.

    Step 7:
    .*? = xxxx - PASS (Add characters to the reluctant quantifier '.*?'. Cell 4 thru 7.)
    foo = xxf - FAIL (Cell 8,9,10 - i.e. index between 8 and 11)
    Match failed.

    Step 8:
    .*? = xxxxx - PASS (Add characters to the reluctant quantifier '.*?'. Cell 4 thru 8.)
    foo = xfo - FAIL (Cell 9,10,11 - i.e. index between 9 and 12)
    Match failed.

    Step 9:
    .*? = xxxxxx - PASS (Add characters to the reluctant quantifier '.*?'. Cell 4 thru 9.)
    foo = foo - PASS (Cell 10,11,12 - i.e. index between 10 and 13)
    Report MATCH (the search for the next match resumes at index 13)

    Step 10:
    .*? = '' (blank) - PASS (Match as little as possible to the reluctant quantifier '.*?'. Index 13 is blank.)
    foo = No character left to match - FAIL (There is nothing after index 13 to match)
    Match failed.

    Result: 2 match(es)
    I found the text "xfoo" starting at index 0 and ending at index 4.
    I found the text "xxxxxxfoo" starting at index 4 and ending at index 13.
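
    The reluctant walk-through can be sketched the same way. Again this is an illustrative simulation under assumptions, not the engine's actual implementation: ReluctantTrace and reluctantFindAll are hypothetical names, and the literal is assumed to be non-empty. The helper starts '.*?' at zero characters, grows it one cell at a time until 'foo' fits, reports the match, and then resumes where that match ended.

```java
import java.util.ArrayList;
import java.util.List;

class ReluctantTrace {
    // Hypothetical helper: simulates repeated find() calls for ".*?" + literal.
    // Each entry is formatted as "text@start-end".
    static List<String> reluctantFindAll(String input, String literal) {
        List<String> matches = new ArrayList<>();
        int start = 0;
        while (start <= input.length()) {
            int end = -1;
            // Grow ".*?" from 0 characters upward; stop at the first length that works.
            for (int len = 0; start + len <= input.length(); len++) {
                if (input.startsWith(literal, start + len)) {
                    end = start + len + literal.length();
                    break;
                }
            }
            if (end < 0) {
                break; // the literal does not occur at or after `start`, so no later start can match
            }
            matches.add(input.substring(start, end) + "@" + start + "-" + end);
            start = end; // resume the search where the previous match ended
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(reluctantFindAll("xfooxxxxxxfoo", "foo"));
    }
}
```

    For the input above this should list the same two matches as the result, "xfoo" from index 0 to 4 and "xxxxxxfoo" from index 4 to 13.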

    Possessive - Give the possessive quantifier as much as possible, then try to match the rest of the regex. Do NOT backtrack.

    Input String: xfooxxxxxxfoo
    Regex: .*+foo

    The above regex has two parts: '.*+' and 'foo'.

    Step 1:
    .*+ = xfooxxxxxxfoo - PASS (Match as much as possible to the possessive quantifier '.*+')
    foo = No character left to match - FAIL (Nothing to match after index 13)
    Match failed.

    Note: Backtracking is not allowed, so '.*+' never releases the trailing 'foo' and the regex cannot match.

    Result: 0 match(es)
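
    To check all three results against the real engine, a minimal sketch using java.util.regex follows. The class QuantifierComparison and the helper countMatches are names made up for this example; Pattern, Matcher and find() are the standard Java API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierComparison {
    // Counts how many times find() succeeds for the given regex on the input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        System.out.println("greedy     .*foo  -> " + countMatches(".*foo", input));   // 1
        System.out.println("reluctant  .*?foo -> " + countMatches(".*?foo", input));  // 2
        System.out.println("possessive .*+foo -> " + countMatches(".*+foo", input));  // 0
    }
}
```

    The only difference between the three patterns is the quantifier suffix, which makes it easy to see that the match counts (1, 2, 0) come from the backtracking behaviour alone.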
