Question
I've worked with regexes for years and have never had this much trouble working one out. I am using preg_match() in PHP 7.2.2 to return an array of matched numbers for parsing, hence the parentheses in the regex.
I am trying to match one or more digits, followed by an "x", followed by one or more digits, where the whole match is not followed by a hyphen. When $input is "18x18" or "18x18-", the captured groups are 18 and 1. When $input is "8x8" there are no matches. I seem to be doing something fundamentally wrong here.
<?php
$input = "18x18";
preg_match("/(\d+)x(\d+)[^-]/", $input, $matches);
Calling print_r($matches) results in:
Array
(
[0] => 18x18
[1] => 18
[2] => 1
)
The parens are there because I am using PHP's preg_match to return an array of matches. I understand when hyphens should be escaped and I've tried both ways to be sure but get the same result. Why doesn't this match?
Answer 1:
You may use
'~(\d+)x(\d++)(?!-)~'
It can also be written without a possessive quantifier as '~(\d+)x(\d+)(?![-\d])~' since the \d inside the lookahead will also forbid matching the second digit chunk partially.
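The reason a plain \d+ followed by (?!-) is not enough can be seen in a small sketch. It compares the two patterns above with a non-possessive variant that is not part of the suggestion and is included only for contrast. The helper name capture_dims is hypothetical, introduced here for illustration.

```php
<?php
// Hypothetical helper (not from the original answer): returns the two
// captured numbers, or null when the pattern does not match.
function capture_dims(string $pattern, string $input): ?array
{
    return preg_match($pattern, $input, $m) === 1 ? [$m[1], $m[2]] : null;
}

$patterns = [
    // Contrast case only: \d+ may backtrack, so the lookahead is retried
    // after giving up the last digit.
    'plain lookahead'    => '~(\d+)x(\d+)(?!-)~',
    // Possessive quantifier: no backtracking into the second digit run.
    'possessive'         => '~(\d+)x(\d++)(?!-)~',
    // The lookahead itself rejects a following digit.
    'digit in lookahead' => '~(\d+)x(\d+)(?![-\d])~',
];

foreach ($patterns as $label => $pattern) {
    $result = capture_dims($pattern, '18x18-');
    echo $label, ': ', $result === null ? 'no match' : implode(' - ', $result), "\n";
}
```

With the input "18x18-", the contrast pattern backtracks to "18x1" (the next character is then "8", not "-"), while both suggested patterns report no match.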
Alternatively, in addition to the lookahead, you may use word boundaries:
'~\b(\d+)x(\d+)\b(?!-)~'
See the regex demo #1 and regex demo #2.
Details
(\d+)x(\d++)(?!-) - matches and captures one or more digits into Group 1, then matches x, and then matches and captures one or more digits into Group 2 possessively, so no backtracking into the digit pattern is allowed. The (?!-) negative lookahead (which makes sure there is no - immediately after the current position) is therefore checked only once, after \d++ has matched all the digits it can.
(\d+)x(\d+)(?![-\d]) - same as above, except that the second run of one or more digits is matched non-possessively, and then the negative lookahead makes sure there is neither a digit nor a - immediately to the right of the current location.
\b(\d+)x(\d+)\b(?!-) - matches a word boundary first, then matches and captures one or more digits into Group 1, then matches x, then matches and captures one or more digits into Group 2, then asserts that there is a word boundary, and only then makes sure there is no - right after.
See a PHP demo:
if (preg_match('~(\d+)x(\d++)(?!-)~', "18x18", $m)) {
echo "18x18: " . $m[1] . " - " . $m[2] . "\n";
}
if (preg_match('~\b(\d+)x(\d+)\b(?!-)~', "18x18", $m)) {
echo "18x18: " . $m[1] . " - " . $m[2] . "\n";
}
if (preg_match('~(\d+)x(\d++)(?!-)~', "18x18-", $m)) {
echo "18x18-: " . $m[1] . " - " . $m[2] . "\n";
}
if (preg_match('~\b(\d+)x(\d+)\b(?!-)~', "18x18-", $m)) {
echo "18x18-: " . $m[1] . " - " . $m[2];
}
Output:
18x18: 18 - 18
18x18: 18 - 18
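As for why the pattern in the question behaves the way it does: [^-] is a character class, so it has to consume one character that is not a hyphen, whereas a lookahead only inspects the next position. The sketch below runs both against the three inputs from the question, under the assumption that the possessive pattern above is the replacement being used. The helper name describe_match is hypothetical.

```php
<?php
// Hypothetical helper for illustration: formats the two capture groups,
// or reports that preg_match() found nothing.
function describe_match(string $pattern, string $input): string
{
    if (preg_match($pattern, $input, $m) !== 1) {
        return 'no match';
    }
    return $m[1] . ' - ' . $m[2];
}

$original = '/(\d+)x(\d+)[^-]/';    // [^-] must consume one character
$fixed    = '~(\d+)x(\d++)(?!-)~';  // (?!-) only inspects, consumes nothing

foreach (['18x18', '8x8', '18x18-'] as $input) {
    echo $input, ' original: ', describe_match($original, $input), "\n";
    echo $input, ' fixed:    ', describe_match($fixed, $input), "\n";
}
```

With the original pattern, "18x18" and "18x18-" both yield 18 - 1 because the second \d+ gives up its last digit so that [^-] can consume it, and "8x8" fails because there is no digit to give up. With the lookahead version, "18x18" yields 18 - 18, "8x8" yields 8 - 8, and "18x18-" does not match.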
Source: https://stackoverflow.com/questions/51264400/regex-match-numbers-not-followed-by-a-hyphen