Question
class Test
{
    public static void main (String[] args)
    {
        String regex = "\\p{L}";
        System.out.println("0".matches(regex));
    }
}
The code above prints false, but I was expecting true. Isn't ASCII a subset of Unicode? "0" is part of ASCII, so I thought it should also count as a Unicode letter.
A comma, a period, etc. also print false, while "a" prints true.
Answer 1:
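That is because \\p{L} matches a Unicode letter, and you are matching a digit. ASCII being a subset of Unicode only means that "0" is a Unicode character; its general category is Nd (decimal digit), not L (letter). Likewise, a comma and a period are punctuation, not letters.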
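To make the category distinction concrete, here is a minimal sketch that checks a few single-character strings against \\p{L}, \\p{Nd} and \\p{P}. The class name UnicodeCategoryDemo and the helper categoryOf are made up for illustration; they are not from the question.

```java
import java.util.regex.Pattern;

public class UnicodeCategoryDemo {
    // Unicode general categories: L = letter, Nd = decimal digit, P = punctuation
    static final Pattern LETTER = Pattern.compile("\\p{L}");
    static final Pattern DIGIT = Pattern.compile("\\p{Nd}");
    static final Pattern PUNCT = Pattern.compile("\\p{P}");

    // Hypothetical helper: names the first of the three categories that the whole string matches.
    static String categoryOf(String s) {
        if (LETTER.matcher(s).matches()) return "letter";
        if (DIGIT.matcher(s).matches()) return "digit";
        if (PUNCT.matcher(s).matches()) return "punctuation";
        return "other";
    }

    public static void main(String[] args) {
        // \u00e9 is "e with acute accent", a non-ASCII letter
        for (String s : new String[] {"0", "a", "\u00e9", ",", "."}) {
            System.out.println(s + " -> " + categoryOf(s));
        }
    }
}
```

With these inputs, "0" is reported as a digit, "a" and the accented letter as letters, and the comma and period as punctuation, which is why only "a" printed true in the question.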
You can use:
[\\p{L}\\p{Nd}.,]
to match a Unicode letter or decimal digit, as well as a period or a comma.
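You can also put (?U) in front of your regex. It turns on UNICODE_CHARACTER_CLASS, which makes the predefined classes such as \\w and \\d follow Unicode as well (\\p{L} and \\p{Nd} are Unicode properties with or without it), like this:
String regex = "(?U)[\\p{L}\\p{Nd}.,]+";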
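As a rough illustration of how this character class behaves, the sketch below tries it on a few inputs and then shows what (?U) changes for \\w. The class name LetterDigitDemo and the helper isAllowed are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

public class LetterDigitDemo {
    // One or more Unicode letters, decimal digits, periods or commas
    static final Pattern ALLOWED = Pattern.compile("[\\p{L}\\p{Nd}.,]+");

    // Hypothetical helper: true when the entire string consists of allowed characters.
    static boolean isAllowed(String s) {
        return ALLOWED.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("0"));        // true
        System.out.println(isAllowed("a"));        // true
        System.out.println(isAllowed(","));        // true
        System.out.println(isAllowed("abc,123.")); // true
        System.out.println(isAllowed("\u0663"));   // true (Arabic-Indic digit three, category Nd)
        System.out.println(isAllowed("a b"));      // false (space is not in the class)

        // (?U) affects predefined classes like \w, not \p{...} properties
        System.out.println(Pattern.matches("\\w", "\u00e9"));     // false
        System.out.println(Pattern.matches("(?U)\\w", "\u00e9")); // true
    }
}
```

The last two lines suggest where (?U) matters: without it \\w only covers ASCII word characters, while the \\p{L} and \\p{Nd} results above are the same either way.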
Source: https://stackoverflow.com/questions/41846074/java-regex-why-numbers-0-9-comma-etc-is-not-an-unicode