Question
Consider the following:
$text = 'c c++ c# and other text';
$skills = array('c','c++','c#','java',...);
foreach ($skill as $skill) {
    if (preg_match('/\b'.$skill.'\b/', $text)) {
        echo $skill.' is matched';
    }
}
In the case of 'c', it matches 'c', 'c#', and 'c++'. I've tried using the assertion (?=\s) or the class [\s|.] in place of the trailing \b, but I need something that behaves like \b without treating + or # as a boundary.
I've checked other posts, but none of them seems to cover this exact situation. Thanks!
Answer 1:
The problem is that \b matches between c and + or #. You need something like this:
$text = 'c c++ c# and other text';
$skills = array('c','c++','c#','java');
foreach ($skills as $skill) {
    // preg_quote() escapes regex metacharacters; '/' is the pattern delimiter
    if (preg_match('/(?<=^|\s)'.preg_quote($skill, '/').'(?:\s|$)/', $text)) {
        echo $skill.' is matched';
    }
}
This matches when the skill is preceded by either the start of the string (^) or a whitespace character (\s), and followed by either a whitespace character or the end of the string ($).
You need to use preg_quote(), like I did above, because c++ contains regex special characters.
Also, note the typo in your original code: foreach ($skill as $skill) is missing an s and should be foreach ($skills as $skill).
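To see why \b fails here and how the whitespace-based pattern behaves, here is a minimal sketch. The helper name matchSkills is hypothetical (it is not part of the original code), and it uses a lookahead (?=\s|$) rather than (?:\s|$) so that the trailing whitespace is not consumed. It assumes skills are always delimited by whitespace or by the ends of the string.

```php
<?php
// Hypothetical helper (not part of the original code): returns the skills
// that appear in $text as whole, whitespace-delimited tokens.
function matchSkills(string $text, array $skills): array
{
    $found = [];
    foreach ($skills as $skill) {
        // preg_quote() escapes regex metacharacters such as "+";
        // passing '/' also escapes the pattern delimiter.
        $quoted = preg_quote($skill, '/');
        // Lookbehind: start of string or whitespace before the skill.
        // Lookahead: whitespace or end of string after the skill.
        $pattern = '/(?<=^|\s)' . $quoted . '(?=\s|$)/';
        if (preg_match($pattern, $text) === 1) {
            $found[] = $skill;
        }
    }
    return $found;
}

// With \b, "c" matches inside "c++" because there is a word boundary
// between the word character "c" and the non-word character "+".
var_dump(preg_match('/\bc\b/', 'c++ and c#')); // int(1)

// With the whitespace lookarounds, only standalone tokens match.
print_r(matchSkills('c++ c# and other text', ['c', 'c++', 'c#', 'java']));
```

In the last call the text contains no standalone c, so only c++ and c# are returned. If skills can also be followed by punctuation such as a comma or a period, the lookahead would need to allow those characters as well.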
Answer 2:
Part of the problem is that c++ contains regex special characters (+ is a quantifier). You should call preg_quote() on $skill, and then use your lookaround approach (a lookbehind before the skill and a lookahead after it).
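The other thing to watch is backslashes, because PHP also uses \ as an escape character in string literals. In single-quoted strings like the ones above, only \\ and \' are interpreted, so \b and \s reach the regex engine unchanged. In double-quoted strings, PHP interprets sequences such as \n or \$ itself, so a backslash meant for the regex should be doubled there.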
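Both points can be checked with a few lines. This is only an illustrative sketch; the sample strings are chosen for demonstration and are not from the original question.

```php
<?php
// preg_quote() backslash-escapes regex metacharacters in the skill name.
var_dump(preg_quote('c++', '/') === 'c\+\+'); // bool(true)

// In single-quoted PHP strings only \\ and \' are escape sequences,
// so \b reaches the regex engine as a backslash followed by "b".
var_dump(strlen('\b'));   // int(2)
var_dump('\\b' === '\b'); // bool(true): doubling is allowed but not required

// In double-quoted strings PHP interprets sequences such as \n itself,
// so a backslash meant for the regex should be doubled there.
var_dump(strlen("\n"));   // int(1): a single newline character
var_dump("\\s" === '\s'); // bool(true)
```

Keeping patterns in single-quoted strings, as the code in this thread does, avoids most of the double-escaping concerns.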
Source: https://stackoverflow.com/questions/20560303/php-how-to-set-regex-boundary-and-not-match-non-alphanumeric-char