Regex java. Why using intersection?

徘徊边缘 submitted on 2019-12-29 07:49:10

Question


I took the following passage from the Oracle tutorial on Java regular expressions:

Intersections

To create a single character class matching only the characters common to all of its nested classes, use &&, as in [0-9&&[345]]. This particular intersection creates a single character class matching only the numbers common to both character classes: 3, 4, and 5.

Enter your regex: [0-9&&[345]]
Enter input string to search: 3
I found the text "3" starting at index 0 and ending at index 1.
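
The session quoted above can be reproduced with java.util.regex directly. This is only a minimal sketch, not the tutorial's own test harness; the class name IntersectionDemo and the helper describe are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IntersectionDemo {
    // Hypothetical helper: reports every match in the style of the tutorial's output.
    static String describe(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            out.append("I found the text \"").append(m.group())
               .append("\" starting at index ").append(m.start())
               .append(" and ending at index ").append(m.end())
               .append(".\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "3" is in both [0-9] and [345], so it matches.
        System.out.print(describe("[0-9&&[345]]", "3"));
        // "2" is in [0-9] but not in [345], so nothing is printed.
        System.out.print(describe("[0-9&&[345]]", "2"));
    }
}
```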

Why would this be useful? If one wants to match only 3, 4, or 5, why not simply write [345] instead of using the intersection?

Thanks in advance.


Answer 1:


Let us consider a simple problem: matching English consonants in a string. One way is to list all the consonants, or the equivalent ranges:

[B-DF-HJ-NP-TV-Zb-df-hj-np-tv-z]

Another way is to use look-around:

(?=[A-Za-z])[^AEIOUaeiou]
(?![AEIOUaeiou])[A-Za-z]
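
As a sanity check, the explicit list and the two look-around patterns can be compared character by character. The sketch below assumes it is enough to test every ASCII character on its own; LookaroundDemo and allAgree are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

class LookaroundDemo {
    static final Pattern LISTED =
            Pattern.compile("[B-DF-HJ-NP-TV-Zb-df-hj-np-tv-z]");
    static final Pattern LOOKAHEAD =
            Pattern.compile("(?=[A-Za-z])[^AEIOUaeiou]");
    static final Pattern NEGATIVE_LOOKAHEAD =
            Pattern.compile("(?![AEIOUaeiou])[A-Za-z]");

    // Returns true if all three patterns accept or reject
    // each single ASCII character identically.
    static boolean allAgree() {
        for (char c = 0; c < 128; c++) {
            String s = String.valueOf(c);
            boolean listed = LISTED.matcher(s).matches();
            boolean positive = LOOKAHEAD.matcher(s).matches();
            boolean negative = NEGATIVE_LOOKAHEAD.matcher(s).matches();
            if (listed != positive || listed != negative) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allAgree());
    }
}
```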

I am not sure whether there is any other way to do this without character class intersection.

Character class intersection solution (Java):

[A-Za-z&&[^AEIOUaeiou]]
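
A short sketch of the intersection pattern in use, pulling the consonants out of a string. ConsonantDemo and consonantsOf are hypothetical names made up for this example, and the sample input is arbitrary.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConsonantDemo {
    // Intersection: an ASCII letter AND not a vowel.
    static final Pattern CONSONANT =
            Pattern.compile("[A-Za-z&&[^AEIOUaeiou]]");

    // Concatenates every consonant found in the input, in order.
    static String consonantsOf(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = CONSONANT.matcher(input);
        while (m.find()) {
            out.append(m.group());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(consonantsOf("Hello, World 42!")); // prints "HllWrld"
    }
}
```

Digits, punctuation, and spaces are skipped because they fall outside [A-Za-z]; vowels are skipped because they fall outside [^AEIOUaeiou].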

For .NET, there is no intersection, but there is character class subtraction:

[A-Za-z-[AEIOUaeiou]]

I don't know the implementation details, but I wouldn't be surprised if character class intersection or subtraction is faster than look-around, which is the cleanest alternative when character class operations are not available.

Another possible use is when you have a pre-built character class and want to remove some characters from it. One case I have come across where class intersection might apply is matching all whitespace characters except the newline.
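
One way the whitespace case might look in Java is [\s&&[^\n]]. This is a sketch under the assumption that "whitespace" means the predefined \s class; WhitespaceDemo and collapse are hypothetical names for illustration.

```java
import java.util.regex.Pattern;

class WhitespaceDemo {
    // Any \s character except the line feed, one or more times.
    static final Pattern WS_EXCEPT_NEWLINE =
            Pattern.compile("[\\s&&[^\\n]]+");

    // Collapses each run of non-newline whitespace into a single space,
    // leaving line breaks untouched.
    static String collapse(String input) {
        return WS_EXCEPT_NEWLINE.matcher(input).replaceAll(" ");
    }

    public static void main(String[] args) {
        // The tab and surrounding spaces collapse; the "\n" survives.
        System.out.println(collapse("a \t b\n\tc"));
    }
}
```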

Another possible use case, as @beerbajay commented:

I think the built-in character classes are the main use case, e.g. [\p{InGreek}&&\p{Ll}] for lowercase Greek letters.
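
A small sketch of the pattern from the comment. The class and method names are hypothetical, and the sample string (the Greek word for Athens followed by Latin letters) is written with Unicode escapes so the source compiles regardless of file encoding.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreekLowercaseDemo {
    // Intersection of the Greek Unicode block and the lowercase-letter category.
    static final Pattern GREEK_LOWER =
            Pattern.compile("[\\p{InGreek}&&\\p{Ll}]");

    // Concatenates every lowercase Greek letter found in the input.
    static String greekLowercaseOf(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = GREEK_LOWER.matcher(input);
        while (m.find()) {
            out.append(m.group());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Capital alpha, then theta, eta with tonos, nu, alpha, then " abc".
        String sample = "\u0391\u03B8\u03AE\u03BD\u03B1 abc";
        // The capital alpha fails \p{Ll}; a, b, c fail \p{InGreek}.
        System.out.println(greekLowercaseOf(sample));
    }
}
```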



Source: https://stackoverflow.com/questions/15930181/regex-java-why-using-intersection
