Why does preg_match("/[^(22|75)]/", "25") return false?

被刻印的时光 ゝ posted on 2021-02-05 12:19:30

Question


I want to test that a given string does not belong to the following group of strings: 22 75.

Could anyone please tell me why PHP's preg_match("/[^(22|75)]/", "25") returns 0?

The weirdest thing is that preg_match("/[^(22|76)]/", "25") returns 1 as expected...

Edit: I think I now understand the reason for my mistake, but not how to check that a given two-digit number does not match any of 20, 21, 22, 23, 24, 75, 76, 77, 78, 79, 80. I need to assemble an expression that checks a given age against the list of allowed ages (this presumes only two-digit numbers).

I cannot use anything other than preg_match() (negating it with !preg_match() is not available in my case); I can only change the regex pattern.


Answer 1:


Time for a Regular Expressions Lesson!


Explanation of your regular expressions

[^(22|75)]

Returns 0 (no match) for "25" because it is looking for the following:

  • A single character NOT in this list of characters: |()275

[^(22|76)]

Returns 1 (a match) for "25" because it is looking for:

  • A single character NOT in this list of characters: |()276

Why does it do this?

You wrapped your regex in a character class.

To give an example of how character classes work, look at this regex:

[2222222222222221111111199999999999]

This character class matches only ONE character, and only if that character is a 2, a 1, or a 9.
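As a small illustration of that point, the sketch below compares the long class with a deduplicated one. The variable names and test inputs are made up for this example, and the ^ and $ anchors are added only so that the whole input has to be a single character.

```php
<?php
// Both classes describe the same set of characters {1, 2, 9};
// repeating a character inside [...] adds nothing.
$verbose = '/^[2222222222222221111111199999999999]$/';
$compact = '/^[129]$/';

foreach (['1', '2', '9', '3', '22'] as $input) {
    // With the anchors, the whole input must be exactly one character.
    printf(
        "%-3s verbose=%d compact=%d\n",
        $input,
        preg_match($verbose, $input),
        preg_match($compact, $input)
    );
}
```

Both patterns should agree on every input, and "22" is rejected by both because a character class consumes one character, however many times that character is listed.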

How to make it work for you:

To match the number 25 (or 22, 52, and 55), you can use this character class:

[25]{2}

This matches a two-digit number that has either a 2 or a 5 in each position.




Answer 2:


What are character classes

A character class is a collection of characters (not strings). With a character class, you're telling the regex engine to match only one out of several characters.

For example, if you wanted to match an a or e, you'd write [ae]. If you wanted to match grey or gray, you'd write gr[ae]y.

Explanation for first regex

[^(22|75)]

As said above, character classes match a single character from the list. Here, you're using ^ to get a negated character class, so this will match a single character that's not in the supplied list. In this case, our list contains the following characters:

( 2 2 | 7 5 )

Repeated characters are only counted once. So this effectively becomes:

( 2 | 7 5 )

25 is the string you're matching against. The regular expression asks: does the supplied string contain a single character that's not in the above list? Both 2 and 5 are in the list, so the answer is no. That explains why preg_match() returns 0 (strictly speaking not false, which preg_match() returns only on error).

Explanation for second regex

/[^(22|76)]/

It is the same as above. The only difference is that the 5 changed to a 6. The pattern now looks for a character that is not one of the following:

( 2 | 7 6 )

The supplied string is still the same as before - 25. Does the string contain any character that's not in the list above? Yes! It does contain 5 (which is not in the list anymore). That explains why preg_match() returns 1.
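Both results can be reproduced directly. The sketch below uses the two patterns from the question; the $m variable is introduced here only to show which character the second pattern matched.

```php
<?php
// "25": both characters are in the excluded set ( ) | 2 5 7,
// so no character satisfies the negated class.
var_dump(preg_match('/[^(22|75)]/', '25'));      // int(0)

// "25": the 5 is not in the excluded set ( ) | 2 6 7, so it matches.
var_dump(preg_match('/[^(22|76)]/', '25', $m));  // int(1)
var_dump($m[0]);                                 // string(1) "5"
```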

Difference between character classes and alternation

They look similar but they do different things. Alternation can be used when you want to match a single regular expression out of several possible regular expressions. Unlike character classes, alternation works with a regex. A simple string, say foo is also a valid regular expression. It matches f followed by o, followed by o.

Use a character class when you want to match one of the included characters. Use alternation when you want to match one of several strings.
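To make the contrast concrete, here is a minimal sketch. The variable names are illustrative, and the ^ and $ anchors are an addition so that each pattern has to match the entire input.

```php
<?php
// Character class: exactly one character, either a or e, between "gr" and "y".
$class = '/^gr[ae]y$/';
// Alternation: one whole string, either 22 or 75.
$alternation = '/^(22|75)$/';

var_dump(preg_match($class, 'gray'));      // int(1)
var_dump(preg_match($class, 'grey'));      // int(1)
var_dump(preg_match($class, 'graey'));     // int(0): the class consumes one character only

var_dump(preg_match($alternation, '22'));  // int(1)
var_dump(preg_match($alternation, '75'));  // int(1)
var_dump(preg_match($alternation, '25'));  // int(0): 25 is neither of the two strings
```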

How should you modify the regex to obtain correct results

Negate your preg_match() call and use the regex (22|75):

if (!preg_match('/(22|75)/', '25')) {
    # code...
}

This is the easiest approach. If you want to achieve this directly using a regex, then you may want to use look-arounds.
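A negative lookahead is one way to express "not one of these strings" inside the pattern itself, which fits the constraint that the preg_match() call cannot be negated. The patterns below are a sketch rather than the only possible form: they assume the input is always exactly two digits, and the variable names are made up for this example.

```php
<?php
// Two digits, but not exactly 22 or 75.
// (?!...) fails the match if the text ahead is 22 or 75 followed by end of string.
$not22or75 = '/^(?!(?:22|75)$)\d{2}$/';

// Same idea for the longer list in the question's edit: 20-24 and 75-80.
$notInList = '/^(?!(?:2[0-4]|7[5-9]|80)$)\d{2}$/';

var_dump(preg_match($not22or75, '25'));  // int(1): 25 is not excluded
var_dump(preg_match($not22or75, '22'));  // int(0): 22 is excluded

var_dump(preg_match($notInList, '25'));  // int(1)
var_dump(preg_match($notInList, '24'));  // int(0)
var_dump(preg_match($notInList, '80'));  // int(0)
var_dump(preg_match($notInList, '81'));  // int(1)
```

With these patterns preg_match() returns 1 when the number is NOT in the list, so no negation of the return value is needed.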

Alternative solution

If this is exactly what you're trying to do, then you don't need a regular expression at all. Leverage PHP's built-in functions instead! Not only will it be faster, it will be more readable too.

In this case, a simple in_array() should suffice:

if (!in_array('25', array('22', '75'))) {
    # code ...
}



Answer 3:


In a regular expression, [...] matches any single character listed inside the brackets.

To be more precise:

  • [^...]: matches any single character not listed inside the brackets. (^: negate)

Remove the [ and ] if you want to match a string that contains 22 or 76 (put ^ in front of the group to require it at the start of the string).




Answer 4:


Your regex is asking "does the string contain a character that is not (, 2, 7, 5, | or )?"

This is obviously not what you want.

Try this:

if( !in_array("25", array("22","75")))



Answer 5:


^ at the start of [...] negates the character list; it does not negate the group (22|76).

Negating a multi-character sequence in a regex is a tricky subject and can't be easily resolved.

But you could invert the return value of preg_match(), i.e.:

if(!preg_match('@22|76@', '25', $matches))
    doSomething();


Source: https://stackoverflow.com/questions/21881626/why-preg-match-2275-25-returns-false
