How does preg_match handle the delimiter when \Q..\E is used?

Posted by 只谈情不闲聊 on 2019-12-18 09:02:46

Question


I'm playing with regular expressions and I tried the \Q..\E escape sequence.

First try:

$regex = '/\Q http:// \E/';
var_dump(preg_match($regex, ' http:// '));

It tells me that '\' is an unknown modifier, which is completely understandable.

Second try:

$regex = '/\Q http:\/\/ \E/';
var_dump(preg_match($regex, ' http:// '));
var_dump(preg_match($regex, ' http:\/\/ '));

It runs, but it does not match the first string; it matches the second one.

I know that I could use another delimiter character or solve it without \Q..\E, but I'm curious about how it works.

I thought that it first separates the regex from the modifiers at the delimiter (handling the escaping if necessary), and that the regex engine interprets the \Q..\E only after that. But it seems that when \Q is involved, the escaped delimiter is not handled the same way.

What exactly happens in this case?

Thanks!


Answer 1:


\Q and \E can be used to ignore regular expression metacharacters in the pattern.

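If the literal string contains the delimiter (/ for example), either compiling the regular expression fails, because the unescaped delimiter ends the pattern early, or the match fails, because the pattern then tries to match the backslash that was added to escape the delimiter. One would expect delimiters between \Q and \E to be treated as literal characters, but PHP looks for the closing delimiter before \Q...\E is interpreted. A delimiter that does not occur in the literal avoids the problem: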

preg_match('~\Q http:// \E~', ' http:// ', $match);
var_dump($match);

# => array(1) { [0]=> string(9) " http:// " }

Use preg_quote() instead of \Q \E if the delimiter may appear within \Q \E:

$text = ' http:// ';

preg_match('/' . preg_quote($text, '/') . '/', $text, $match);
var_dump($match);

# => array(1) { [0]=> string(9) " http:// " }
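
To make the preg_quote() approach easier to inspect, here is a minimal sketch that prints the quoted string and wraps the idea in a small helper. The helper name literalPattern() is hypothetical and only used for illustration; it assumes a single-character delimiter that is the same at both ends.

```php
<?php
$text = ' http:// ';

// preg_quote() escapes regex metacharacters, plus the delimiter passed as
// the second argument, so the result is safe to embed between delimiters.
$quoted = preg_quote($text, '/');
var_dump($quoted);

// Hypothetical helper (not part of PHP): builds a pattern that matches
// $literal verbatim, using the given single-character delimiter.
function literalPattern(string $literal, string $delimiter = '/'): string
{
    return $delimiter . preg_quote($literal, $delimiter) . $delimiter;
}

var_dump(preg_match(literalPattern($text), 'see http:// here'));
var_dump(preg_match(literalPattern('a~b', '~'), 'a~b'));
```

Because the escaping here happens outside \Q...\E, the backslashes in front of the delimiter are ordinary escapes and are not matched literally.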



Answer 2:


Why doesn't the \Q...\E syntax support the delimiter?

Since the \E is optional, the only way to know where the pattern ends is the delimiter. This is why the delimiter remains an exception inside \Q..\E: it still has to be escaped there.
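
The delimiter scan described above can be sketched roughly as follows. This is a simplified emulation written for illustration, not PHP's actual source: the function name splitPattern() is hypothetical, and bracket-style delimiters such as (...) or {...} are ignored. The point it shows is that the scan skips any backslash-escaped character and knows nothing about \Q...\E.

```php
<?php
// Hypothetical helper: emulates, in simplified form, how the pattern is
// separated from the modifiers. Assumes the same delimiter at both ends.
function splitPattern(string $regex): array
{
    $delimiter = $regex[0];
    $length = strlen($regex);

    for ($i = 1; $i < $length; $i++) {
        if ($regex[$i] === '\\' && $i + 1 < $length) {
            $i++; // skip the escaped character, whatever it is
        } elseif ($regex[$i] === $delimiter) {
            // First unescaped delimiter: everything after it is modifiers.
            return [
                'pattern'   => substr($regex, 1, $i - 1),
                'modifiers' => substr($regex, $i + 1),
            ];
        }
    }

    throw new InvalidArgumentException('No ending delimiter found');
}

$first  = splitPattern('/\Q http:// \E/');
$second = splitPattern('/\Q http:\/\/ \E/');
var_dump($first, $second);
```

In this sketch the first regex is cut off right after "http:", so the rest ends up in the modifier part, while the second regex survives whole with its backslashes still inside \Q...\E.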




Answer 3:


By definition:

Character: \Q...\E 

Description: Matches the characters between \Q and \E literally, suppressing the meaning of special characters

The special characters are:

. \ + * ? [ ^ ] $ ( ) { } = ! < > | : -

Any non-alphanumeric, non-backslash, non-whitespace character that is used as the delimiter is not treated as one of these special characters but purely as a delimiter, so we find that:

var_dump(preg_match("+\Q+\E+", "+"));

raises a similar warning:

Warning: preg_match(): Unknown modifier '\'




Answer 4:


The regex delimiter (/ in this case) appears to be handled weirdly inside the literal string (\Q ... \E). It's a delimiter, so you need to escape it so preg can parse out the regex from the options. But it appears this does not stop \Q and \E from doing their job of interpreting the backslash slash sequence as a literal, rather than an escape.

If you use a different delimiter, things work as expected:

var_dump(preg_match('@\Q http:// \E@', ' http:// '));
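
A side-by-side comparison of the two delimiters may make the difference clearer. This is only a sketch; the variable names are chosen for illustration.

```php
<?php
$plain   = ' http:// ';
$escaped = ' http:\/\/ ';

// Slash delimiter: the backslashes that protect the delimiter stay in the
// pattern, and \Q...\E makes the regex engine match them literally.
$withSlash = '/\Q http:\/\/ \E/';
var_dump(preg_match($withSlash, $plain));   // int(0)
var_dump(preg_match($withSlash, $escaped)); // int(1)

// At-sign delimiter: nothing inside \Q...\E needs escaping.
$withAt = '@\Q http:// \E@';
var_dump(preg_match($withAt, $plain));      // int(1)
var_dump(preg_match($withAt, $escaped));    // int(0)
```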


Source: https://stackoverflow.com/questions/20518758/how-the-preg-match-handles-the-delimiter-when-q-e-used
