Question
I have a localized input field. I need to validate it with a regex so that it accepts only letters and digits. I could have used [a-z0-9]
if I were dealing only with English.
For now, I am using the method Character.isLetterOrDigit(name.charAt(i))
(yes, I am iterating over each character) to accept the letters used in various languages.
Is there a better way of doing this? Is there a regex or a library available for it?
Answer 1:
Since Java 7 you can use Pattern.UNICODE_CHARACTER_CLASS:
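import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodeWordExample {
    public static void main(String[] args) {
        String s = "Müller";
        // With UNICODE_CHARACTER_CLASS, \w is no longer limited to ASCII
        Pattern p = Pattern.compile("^\\w+$", Pattern.UNICODE_CHARACTER_CLASS);
        Matcher m = p.matcher(s);
        if (m.find()) {
            System.out.println(m.group());
        } else {
            System.out.println("not found");
        }
    }
}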
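Without the option, the pattern will not match the word "Müller", because \w is then limited to [a-zA-Z_0-9]. Pattern.UNICODE_CHARACTER_CLASS
enables the Unicode version of predefined character classes and POSIX character classes. Note that \w also matches the underscore, so it is slightly broader than "letters and digits only".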
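To make the effect of the flag concrete, here is a minimal sketch that compiles the same pattern with and without Pattern.UNICODE_CHARACTER_CLASS. The class and method names (UnicodeWordFlagDemo, matchesAscii, matchesUnicode) are made up for illustration, and the non-ASCII letter is written as a \u escape only so the example does not depend on the source file encoding.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: compares \w with and without the Unicode flag.
final class UnicodeWordFlagDemo {
    // Default behaviour: \w is [a-zA-Z_0-9]
    static final Pattern ASCII_WORD = Pattern.compile("^\\w+$");
    // Java 7+: \w follows the Unicode definition of a word character
    static final Pattern UNICODE_WORD =
            Pattern.compile("^\\w+$", Pattern.UNICODE_CHARACTER_CLASS);

    static boolean matchesAscii(String s) {
        return ASCII_WORD.matcher(s).matches();
    }

    static boolean matchesUnicode(String s) {
        return UNICODE_WORD.matcher(s).matches();
    }

    public static void main(String[] args) {
        // "M\u00FCller" is "Müller"
        String[] samples = {"Muller", "M\u00FCller", "user_name"};
        for (String s : samples) {
            System.out.println(s + " ascii=" + matchesAscii(s)
                    + " unicode=" + matchesUnicode(s));
        }
    }
}
```

With these inputs, only the flagged pattern accepts "Müller", while both patterns accept "user_name", which shows that the underscore is let through in either mode.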
See here for more details.
You can also have a look here for more Unicode information in Java 7,
and here on regular-expressions.info for an overview of the Unicode scripts, properties and blocks.
See here a famous answer from tchrist about the caveats of regex in Java, including an update on what has changed with Java 7 (or will change in Java 8).
Answer 2:
boolean foundMatch = name.matches("[\\p{L}\\p{Nd}]*");
should work.
[\p{L}\p{Nd}]
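matches a single character that is either a Unicode letter or a decimal digit. The String.matches()
method ensures that the entire string matches the pattern. Because of the *, the empty string is accepted as well; use + instead if at least one character is required.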
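If the check runs often, the pattern can be compiled once and reused instead of calling String.matches() each time. The following is only a sketch of that idea: the class name LetterDigitRegex and the method isValid are hypothetical, and rejecting null and empty input (via + instead of *) is an assumption, not something the question requires.

```java
import java.util.regex.Pattern;

// Hypothetical helper: validates that a string contains only Unicode
// letters (\p{L}) and decimal digits (\p{Nd}).
final class LetterDigitRegex {
    // Compiled once; '+' requires at least one character (an assumption).
    private static final Pattern LETTERS_AND_DIGITS =
            Pattern.compile("[\\p{L}\\p{Nd}]+");

    static boolean isValid(String name) {
        // matches() requires the whole input to match the pattern
        return name != null && LETTERS_AND_DIGITS.matcher(name).matches();
    }

    public static void main(String[] args) {
        // "M\u00FCller42" is "Müller42"; "\u540D\u524D" are two CJK letters
        String[] samples = {"M\u00FCller42", "\u540D\u524D", "a_b", "a b", ""};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```

Unlike \w, this character class rejects the underscore, so "a_b" does not pass.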
Answer 3:
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
-- Jamie Zawinski
I say this in jest, but iterating through the String as you are doing will perform at least as well as any regex. A regex cannot do this check any faster, and with the loop you avoid the overhead of compiling a pattern in the first place.
So as long as:
- the validation doesn't need to do anything else regex-like (nothing was mentioned in the question)
- the intention of the code looping through the String is clear (and if not, refactor until it is)
then why replace it with a regex just because you can?
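For reference, a loop of this kind might be refactored into a small named method along the following lines. This is a sketch, not the asker's actual code: the names LetterDigitLoop and isLettersAndDigits are hypothetical, and treating null or empty input as invalid is an assumption. It iterates by code point rather than by char, because Character.isLetterOrDigit(char) cannot classify letters outside the Basic Multilingual Plane, which are stored as two surrogate chars.

```java
// Hypothetical helper: loop-based check that every character is a
// Unicode letter or digit, without any regex.
final class LetterDigitLoop {
    static boolean isLettersAndDigits(String name) {
        if (name == null || name.isEmpty()) {
            return false; // assumption: empty input is not valid
        }
        for (int i = 0; i < name.length(); ) {
            int codePoint = name.codePointAt(i);
            if (!Character.isLetterOrDigit(codePoint)) {
                return false;
            }
            // Advance by 1 char for BMP code points, 2 for surrogate pairs
            i += Character.charCount(codePoint);
        }
        return true;
    }

    public static void main(String[] args) {
        // "\uD801\uDC00" is U+10400, a letter encoded as a surrogate pair
        String[] samples = {"M\u00FCller42", "\uD801\uDC00", "a-b"};
        for (String s : samples) {
            System.out.println(s + " -> " + isLettersAndDigits(s));
        }
    }
}
```

Giving the loop a descriptive method name also addresses the second condition above, since the intent is then clear at the call site.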
Source: https://stackoverflow.com/questions/9499851/regex-for-validating-alphabetics-and-numbers-in-the-localized-string