Documentation for ?: in regex?

日久生厌 2020-12-08 14:36

A while ago, I saw that in regex (at least in PHP) you can stop a group from capturing by prepending ?: to its contents.

Example

$str = 'big blue b

6 Answers
  • 2020-12-08 14:46

    (?:) as a whole represents a non-capturing group.

    Regular-expressions.info describes this syntax:

    The question mark and the colon after the opening round bracket are the special syntax that you can use to tell the regex engine that this pair of brackets should not create a backreference. Note the question mark [...] is the regex operator that makes the previous token optional. This operator cannot appear after an opening round bracket, because an opening bracket by itself is not a valid regex token. Therefore, there is no confusion between the question mark as an operator to make a token optional, and the question mark as a character to change the properties of a pair of round brackets. The colon indicates that the change we want to make is to turn off capturing the backreference.
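
    A minimal sketch of the difference, assuming PHP's preg_match; the subject string and both patterns are made up for illustration and do not come from the question:

    ```php
    <?php
    // Hypothetical input; any string with a title and a name would do.
    $subject = 'Ms Lovelace';

    // Capturing group: the title is stored as group 1, the name as group 2.
    preg_match('/(Mr|Ms) (\w+)/', $subject, $capturing);

    // Non-capturing group: the alternation still groups, but only the name is stored.
    preg_match('/(?:Mr|Ms) (\w+)/', $subject, $nonCapturing);

    print_r($capturing);    // [0] => Ms Lovelace, [1] => Ms, [2] => Lovelace
    print_r($nonCapturing); // [0] => Ms Lovelace, [1] => Lovelace
    ```

    With ?: the name moves from index 2 to index 1, because the first pair of brackets no longer counts as a group.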

  • 2020-12-08 14:58

    PHP's preg_match_all uses the PCRE (Perl-Compatible Regular Expression) syntax, which is documented here. Non-capturing subpatterns are documented in the Subpatterns chapter.

    would lead me to believe it has something to do with lookaheads or lookbehinds.

    Nope, there are lots of different features which are triggered by open-bracket-question-mark. Lookahead/lookbehind is just the first one you met.

    It's messy that so many options have to be squeezed into (? instead of each getting a more readable syntax of its own, but every new feature had to fit into a sequence that was not a valid expression in older variants of regex.
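
    To illustrate how many unrelated features share the (? prefix, here is a rough sketch using PCRE constructs through preg_match. The sample string and patterns are assumptions chosen for the example:

    ```php
    <?php
    // Hypothetical sample string, used only for illustration.
    $subject = 'price: 42 USD';

    // (?:...)  non-capturing group: groups the alternation, stores nothing.
    preg_match('/(?:price|cost): (\d+)/', $subject, $m1);

    // (?=...)  positive lookahead: digits only if " USD" follows; " USD" is not consumed.
    preg_match('/\d+(?= USD)/', $subject, $m2);

    // (?<=...) positive lookbehind: digits only if "price: " precedes them.
    preg_match('/(?<=price: )\d+/', $subject, $m3);

    // (?<name>...) named capturing group.
    preg_match('/(?<amount>\d+) (?<currency>[A-Z]{3})/', $subject, $m4);

    // (?i) inline option: case-insensitive from this point on.
    preg_match('/(?i)PRICE/', $subject, $m5);

    echo $m1[1], "\n";          // 42
    echo $m2[0], "\n";          // 42
    echo $m3[0], "\n";          // 42
    echo $m4['currency'], "\n"; // USD
    echo $m5[0], "\n";          // price
    ```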

  • 2020-12-08 14:59

    I don't know how to do this with ?:, but it is easy with a simple loop:

    $regex = '/b(ig|all)/';
    $array = array(
        0 => array(0 => 'big', 1 => 'ball'),
        1 => array(0 => 'ig', 1 => 'all')
    );
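    // Drop any row that contains a value the pattern does not match.
    foreach ($array as $key => $row) {
        foreach ($row as $val) {
            if (!preg_match($regex, $val)) {
                unset($array[$key]);
                break; // one failing value is enough to drop the row
            }
        }
    }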
    print_r($array);
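
    For comparison, the same result can probably be had without any post-filtering by making the group non-capturing. This is only a sketch: the subject string is hypothetical (the string in the question is cut off), and only the pattern b(ig|all) is taken from the code above.

    ```php
    <?php
    // Hypothetical subject string; the original question's string is truncated.
    $subject = 'a big ball';

    // Capturing: index 1 holds the text matched by (ig|all).
    preg_match_all('/b(ig|all)/', $subject, $withCapture);

    // Non-capturing: only the full matches at index 0 remain.
    preg_match_all('/b(?:ig|all)/', $subject, $withoutCapture);

    print_r($withCapture);    // [0] => [big, ball], [1] => [ig, all]
    print_r($withoutCapture); // [0] => [big, ball]
    ```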
    
  • 2020-12-08 15:00

    It's in the PHP manual and, I believe, in any reasonably complete regular-expression reference for any language:

    The fact that plain parentheses fulfill two functions is not always helpful. There are often times when a grouping subpattern is required without a capturing requirement. If an opening parenthesis is followed by "?:", the subpattern does not do any capturing, and is not counted when computing the number of any subsequent capturing subpatterns.

    Source

  • 2020-12-08 15:05

    It's available on the Subpatterns page of the official documentation.

    The fact that plain parentheses fulfill two functions is not always helpful. There are often times when a grouping subpattern is required without a capturing requirement. If an opening parenthesis is followed by "?:", the subpattern does not do any capturing, and is not counted when computing the number of any subsequent capturing subpatterns. For example, if the string "the white queen" is matched against the pattern the ((?:red|white) (king|queen)) the captured substrings are "white queen" and "queen", and are numbered 1 and 2. The maximum number of captured substrings is 99, and the maximum number of all subpatterns, both capturing and non-capturing, is 200.
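
    The documentation's example can be checked directly. A short sketch, with the pattern and subject taken from the quoted passage and wrapped in / delimiters for preg_match:

    ```php
    <?php
    // Pattern and subject come from the quoted documentation example.
    $pattern = '/the ((?:red|white) (king|queen))/';

    preg_match($pattern, 'the white queen', $matches);

    print_r($matches);
    // [0] => the white queen   (the whole match)
    // [1] => white queen       (outer capturing group)
    // [2] => queen             (inner capturing group; (?:red|white) is not numbered)
    ```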

    It's also good to note that you can set options for the subpattern with it. For example, if you want only the sub-pattern to be case insensitive, you can do:

    (?i:foo)bar
    

    Will match:

    • foobar
    • Foobar
    • FoObar
    • ...etc

    But not

    • fooBar
    • FooBAR
    • ...etc
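
    A quick way to confirm the lists above is to run the pattern over the example strings. This sketch assumes preg_match and uses a hypothetical $results array to collect the outcome:

    ```php
    <?php
    // Only "foo" is case-insensitive; "bar" must still be lowercase.
    $pattern = '/(?i:foo)bar/';
    $candidates = ['foobar', 'Foobar', 'FoObar', 'fooBar', 'FooBAR'];

    $results = [];
    foreach ($candidates as $candidate) {
        // preg_match returns 1 on a match and 0 otherwise.
        $results[$candidate] = preg_match($pattern, $candidate) === 1;
    }

    var_export($results);
    // foobar, Foobar and FoObar are true; fooBar and FooBAR are false.
    ```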

    Oh, and while the official documentation doesn't actually explicitly name the syntax, it does refer to it later on as a "non-capturing subpattern" (which makes complete sense, and is what I would call it anyway, since it's not really a "group", but a subpattern)...

  • 2020-12-08 15:06

    Here's what I've found:

    If you do not use the backreference, you can optimize this regular expression into Set(?:Value)?. The question mark and the colon after the opening round bracket are the special syntax that you can use to tell the regex engine that this pair of brackets should not create a backreference. Note the question mark after the opening bracket is unrelated to the question mark at the end of the regex. That question mark is the regex operator that makes the previous token optional. This operator cannot appear after an opening round bracket, because an opening bracket by itself is not a valid regex token. Therefore, there is no confusion between the question mark as an operator to make a token optional, and the question mark as a character to change the properties of a pair of round brackets. The colon indicates that the change we want to make is to turn off capturing the backreference.

    http://www.regular-expressions.info/brackets.html
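
    As a small sketch of the quoted Set(?:Value)? example in PHP (the variable names are made up for illustration):

    ```php
    <?php
    // The trailing ? makes the whole (?:Value) group optional;
    // the ?: inside the brackets only switches off capturing.
    $pattern = '/Set(?:Value)?/';

    preg_match($pattern, 'SetValue', $long);
    preg_match($pattern, 'Set', $short);

    // Same pattern with a capturing group, for comparison.
    preg_match('/Set(Value)?/', 'SetValue', $captured);

    print_r($long);     // [0] => SetValue
    print_r($short);    // [0] => Set
    print_r($captured); // [0] => SetValue, [1] => Value
    ```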
