Question
I wanted to match non-Latin characters and tried the code below. As I understand it, a.matches("[\\x8A-\\xFF]+")
should return true, but it returns false.
String a = "Ž";
if (a.matches("[\\x8A-\\xFF]+")) {
    // expected to reach this branch, but the condition is false
}
Answer 1:
Judging from your title:
Regex to match non-latin char with ASCII 0-31 and 128-255
it seems you're after all characters except those in the range 32-127, and you're surprised that Ž doesn't match.
If this is correct, I suggest you use the expression [^\x20-\x7F] ("all characters except those in the range 32-127"). This does match Ž.
(An exact translation of the regex in your title would look like [\x00-\x1F\x80-\xFF], but this still doesn't match Ž, as described below.)
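As a rough check of the two expressions above, here is a minimal Java sketch. The class name NonAsciiMatch and the helper methods outside32to127 and inTitleRanges are hypothetical names introduced only for this illustration; the regexes themselves are the ones quoted in the answer.

```java
import java.util.regex.Pattern;

public class NonAsciiMatch {
    // Everything except code points 32-127 (the suggested expression).
    static final Pattern OUTSIDE_32_127 = Pattern.compile("[^\\x20-\\x7F]+");

    // Literal translation of the title: code points 0-31 and 128-255.
    static final Pattern TITLE_RANGES = Pattern.compile("[\\x00-\\x1F\\x80-\\xFF]+");

    // Hypothetical helper: true if every character lies outside 32-127.
    static boolean outside32to127(String s) {
        return OUTSIDE_32_127.matcher(s).matches();
    }

    // Hypothetical helper: true if every character lies in 0-31 or 128-255.
    static boolean inTitleRanges(String s) {
        return TITLE_RANGES.matcher(s).matches();
    }

    public static void main(String[] args) {
        String zCaron = "\u017D"; // LATIN CAPITAL LETTER Z WITH CARON, U+017D
        String eAcute = "\u00E9"; // LATIN SMALL LETTER E WITH ACUTE, U+00E9

        System.out.println(outside32to127(zCaron)); // true: U+017D is not in 32-127
        System.out.println(inTitleRanges(zCaron));  // false: U+017D is above 255
        System.out.println(inTitleRanges(eAcute));  // true: U+00E9 is within 128-255
    }
}
```

The strings are written with \u escapes so that the result does not depend on the encoding used to compile the source file.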
Why your initial attempt didn't work:
The \xNN escape matches a character by its Unicode value. The Unicode value of Ž is 0x017D, i.e. it falls outside the range \x8A-\xFF.
When you say "Ž" is 8E, you're most likely seeing a value from an extended ASCII table (Windows-1252, for example, places Ž at 0x8E), and these are not the values that the Java regex engine works with.
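The gap between the two numbers can be seen directly with a small sketch. It assumes the "extended ASCII table" in question is Windows-1252 and that the running JVM supports that charset, which the original post does not confirm; the class name CodePointVsByte and its helper methods are hypothetical.

```java
import java.nio.charset.Charset;

public class CodePointVsByte {
    // The Unicode code point of the first character: the value the regex engine compares.
    static int codePoint(String s) {
        return s.codePointAt(0);
    }

    // The single byte that encodes the first character in Windows-1252
    // (assumption: this is the table the 8E value came from).
    static int cp1252Byte(String s) {
        byte[] bytes = s.getBytes(Charset.forName("windows-1252"));
        return bytes[0] & 0xFF; // mask to get an unsigned value
    }

    public static void main(String[] args) {
        String zCaron = "\u017D"; // LATIN CAPITAL LETTER Z WITH CARON

        System.out.printf("code point: 0x%04X%n", codePoint(zCaron));  // 0x017D
        System.out.printf("cp1252 byte: 0x%02X%n", cp1252Byte(zCaron)); // 0x8E

        // The original range tests the code point, not the encoded byte.
        System.out.println(zCaron.matches("[\\x8A-\\xFF]+")); // false
    }
}
```

The byte 0x8E exists only after the string is encoded with a particular charset; the regex engine never sees it, because a Java String holds Unicode (UTF-16) values.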
Source: https://stackoverflow.com/questions/30500028/regex-to-match-non-latin-char-with-ascii-0-31-and-128-255