Regex match numbers not followed by a hyphen

Submitted by 元气小坏坏 on 2021-02-05 08:31:36

Question


I've worked with regexes for years and never had this much trouble working out a regex. I am using PHP 7.2.2's preg_match() to return an array of matching numbers for parsing, hence the parens in the regex.

I am trying to match one or more numbers followed by an "x" followed by one or more numbers where the entire string is not followed by a hyphen. When $input is "18x18" or "18x18-", the matches are 18 and 1. When $input is "8x8" there are no matches. I seem to be doing something fundamentally wrong here.

<?php
$input = "18x18";    
preg_match("/(\d+)x(\d+)[^-]/", $input, $matches);

Calling print_r($matches) results in:

Array
(
    [0] => 18x18
    [1] => 18
    [2] => 1
)

The parens are there because I am using PHP's preg_match to return an array of matches. I understand when hyphens should be escaped and I've tried both ways to be sure but get the same result. Why doesn't this match?


Answer 1:


You may use

'~(\d+)x(\d++)(?!-)~'

It can also be written without a possessive quantifier as '~(\d+)x(\d+)(?![-\d])~' since the \d inside the lookahead will also forbid matching the second digit chunk partially.
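To see why the lookahead alone is not enough, here is a minimal sketch that runs three patterns against "18x18-". The dims() helper and the variable names are hypothetical, introduced only for illustration. The first pattern is a naive variant (lookahead with an ordinary \d+), which is not one the answer recommends.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the two
// capture groups for a pattern/input pair, or null when nothing matches.
function dims(string $pattern, string $input): ?array
{
    return preg_match($pattern, $input, $m) === 1 ? [$m[1], $m[2]] : null;
}

$naive      = '~(\d+)x(\d+)(?!-)~';      // lookahead only, backtracking allowed
$possessive = '~(\d+)x(\d++)(?!-)~';     // possessive quantifier on Group 2
$lookahead  = '~(\d+)x(\d+)(?![-\d])~';  // digit added to the lookahead

foreach ([$naive, $possessive, $lookahead] as $pattern) {
    echo $pattern, ' => ', json_encode(dims($pattern, '18x18-')), "\n";
}
```

With the naive pattern, the engine gives back the last digit of Group 2 so that the lookahead sees "8" instead of "-", and it reports 18 and 1. The other two patterns close that loophole and report no match for "18x18-".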

Alternatively, in addition to the lookahead, you may use word boundaries:

'~\b(\d+)x(\d+)\b(?!-)~'

See the regex demo #1 and regex demo #2.

Details

  • (\d+)x(\d++)(?!-) / (\d+)x(\d+)(?![-\d]) - captures one or more digits into Group 1, matches x, then captures one or more digits into Group 2. With \d++, the digits are matched possessively, so the engine cannot backtrack into them, and the (?!-) negative lookahead (which fails if a - immediately follows the current position) is checked only once, after \d++ has taken all the digits it can. With \d+(?![-\d]), the digits are matched first, and the negative lookahead then makes sure that neither a digit nor a - is immediately to the right of the current position.
  • \b(\d+)x(\d+)\b(?!-) - matches a word boundary first, then matches and captures 1 or more digits into Group 1, then matches x, then matches and captures into Group 2 one or more digits, then asserts that there is a word boundary, and only then makes sure there is no - right after.
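The two approaches are not interchangeable on every input. The sketch below compares them on a few strings. The inputs "18x18px" and "a18x18" are made-up examples, not from the original question, chosen to show where the word-boundary version is stricter.

```php
<?php
$possessive = '~(\d+)x(\d++)(?!-)~';
$bounded    = '~\b(\d+)x(\d+)\b(?!-)~';

// preg_match() returns 1 on a match and 0 otherwise.
foreach (['18x18', '18x18-', '18x18px', 'a18x18'] as $input) {
    printf(
        "%-8s possessive=%d bounded=%d\n",
        $input,
        preg_match($possessive, $input),
        preg_match($bounded, $input)
    );
}
```

Both patterns accept "18x18" and reject "18x18-". Only the possessive version matches inside "18x18px" and "a18x18", because \b requires the dimensions not to be glued to other word characters. Which behavior is preferable depends on the data being parsed.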

See a PHP demo:

// "18x18" is not followed by a hyphen, so both patterns match.
if (preg_match('~(\d+)x(\d++)(?!-)~', "18x18", $m)) {
    echo "18x18: " . $m[1] . " - " . $m[2] . "\n";
}
if (preg_match('~\b(\d+)x(\d+)\b(?!-)~', "18x18", $m)) {
    echo "18x18: " . $m[1] . " - " . $m[2] . "\n";
}
// "18x18-" is followed by a hyphen, so neither pattern matches and nothing is printed.
if (preg_match('~(\d+)x(\d++)(?!-)~', "18x18-", $m)) {
    echo "18x18-: " . $m[1] . " - " . $m[2] . "\n";
}
if (preg_match('~\b(\d+)x(\d+)\b(?!-)~', "18x18-", $m)) {
    echo "18x18-: " . $m[1] . " - " . $m[2] . "\n";
}

Output:

18x18: 18 - 18
18x18: 18 - 18
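For comparison, the pattern from the question can be run on the three inputs it mentions. The key point is that [^-] is a character class, so it must consume one character that is not a hyphen; it is not an assertion that no hyphen follows. This is a small illustrative sketch, and the variable names are arbitrary.

```php
<?php
$original = '/(\d+)x(\d+)[^-]/';  // pattern from the question

foreach (['18x18', '18x18-', '8x8'] as $input) {
    $found = preg_match($original, $input, $m);
    echo $input, ': ', $found ? json_encode($m) : 'no match', "\n";
}
```

For "18x18" and "18x18-", the second \d+ gives up its last digit so that [^-] can consume the "8", which is why Group 2 holds 1. For "8x8" there is no digit to give up and no character left for [^-], so the match fails.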


Source: https://stackoverflow.com/questions/51264400/regex-match-numbers-not-followed-by-a-hyphen
