Regex for validating alphabetics and numbers in the localized string

Submitted by 巧了我就是萌 on 2019-12-18 02:43:19

Question


I have a localized input field. I need to add validation, using a regex, so that it accepts only letters and numbers. I could have used [a-z0-9] if I were supporting only English.

At the moment I am using Character.isLetterOrDigit(name.charAt(i)) (yes, I am iterating through each character) to recognize letters from various languages.

Are there any better ways of doing it? Any regex or other libraries available for this?


Answer 1:


Since Java 7 you can use Pattern.UNICODE_CHARACTER_CLASS:

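import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "Müller";

// With UNICODE_CHARACTER_CLASS, \w follows the Unicode definition of a word character
Pattern p = Pattern.compile("^\\w+$", Pattern.UNICODE_CHARACTER_CLASS);
Matcher m = p.matcher(s);
if (m.find()) {
    System.out.println(m.group());   // the whole input matched
} else {
    System.out.println("not found");
}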

Without the option, the pattern will not match the word "Müller". With it, the match succeeds, because Pattern.UNICODE_CHARACTER_CLASS:

Enables the Unicode version of Predefined character classes and POSIX character classes.
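
To make the difference concrete, here is a minimal sketch that compiles the same pattern with and without the flag. The class and method names (UnicodeWordDemo, isAsciiWord, isUnicodeWord) are made up for illustration. One caveat worth knowing: \w also accepts the underscore in both modes, and in Unicode mode it additionally accepts combining marks and connector punctuation, so it is broader than "letters and digits only".

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class UnicodeWordDemo {
    // Default mode: \w is the ASCII class [a-zA-Z_0-9].
    static final Pattern ASCII_WORD = Pattern.compile("^\\w+$");

    // Unicode mode: \w covers Unicode letters, digits, marks and connector punctuation.
    static final Pattern UNICODE_WORD =
            Pattern.compile("^\\w+$", Pattern.UNICODE_CHARACTER_CLASS);

    static boolean isAsciiWord(String s) {
        return ASCII_WORD.matcher(s).matches();
    }

    static boolean isUnicodeWord(String s) {
        return UNICODE_WORD.matcher(s).matches();
    }
}
```

With these helpers, isAsciiWord("Müller") is false and isUnicodeWord("Müller") is true, while isUnicodeWord("a_b") is also true because of the underscore.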

See here for more details

You can also have a look here for more Unicode information in Java 7.

regular-expressions.info also has an overview of the Unicode scripts, properties and blocks.

See also the well-known answer from tchrist about the caveats of regex in Java, including an update on what has changed in Java 7 (or will change in Java 8).




Answer 2:


boolean foundMatch = name.matches("[\\p{L}\\p{Nd}]*");

should work.

[\p{L}\p{Nd}] matches a character that is either a Unicode letter or a decimal digit. String.matches() ensures that the entire string matches the pattern. Note that * also accepts the empty string; use + instead if at least one character is required.
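
If the check runs often, one option is to compile the pattern once and reuse it, since String.matches() compiles the regex on every call. The following is only a sketch under that assumption; the class name LetterDigitValidator and the method isValid are hypothetical, and it uses + so that empty input is rejected.

```java
import java.util.regex.Pattern;

// Hypothetical validator, shown only to illustrate the character class above.
class LetterDigitValidator {
    // Compiled once; \p{L} = any Unicode letter, \p{Nd} = any decimal digit.
    private static final Pattern LETTERS_AND_DIGITS =
            Pattern.compile("[\\p{L}\\p{Nd}]+");

    // Returns true only if the whole string consists of letters and decimal digits.
    static boolean isValid(String name) {
        return name != null && LETTERS_AND_DIGITS.matcher(name).matches();
    }
}
```

Unlike \w, this class rejects the underscore. It also rejects combining marks, so a decomposed "u" followed by U+0308 would fail; normalizing the input to NFC first with java.text.Normalizer is one way to handle that.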




Answer 3:


Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.

-- Jamie Zawinski

I say this in jest, but iterating through the String as you are doing will perform at least as well at runtime as any regex. A regex cannot do what you want any faster, and the loop avoids the overhead of compiling a pattern in the first place.

So as long as:

  • the validation doesn't need to do anything else regex-like (nothing was mentioned in the question)
  • the intention of the code looping through the String is clear (and if not, refactor until it is)

then why replace it with a regex just because you can?
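
For what it's worth, here is one way the loop could be written so that its intent is clear. This is a sketch, not the asker's actual code: the names CodePointValidator and isLettersAndDigits are hypothetical, and it assumes Java 8 or later for String.codePoints(). It iterates over code points rather than chars, because Character.isLetterOrDigit(char) returns false for each half of a surrogate pair and would therefore reject letters outside the Basic Multilingual Plane.

```java
// Hypothetical helper, shown only to illustrate the loop-based approach.
class CodePointValidator {
    // Returns true if the string is non-empty and every code point is a letter or digit.
    static boolean isLettersAndDigits(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        // Resolves to Character.isLetterOrDigit(int), which handles supplementary characters.
        return name.codePoints().allMatch(Character::isLetterOrDigit);
    }
}
```

A call such as isLettersAndDigits(name) then reads much like the regex version, with no pattern to compile.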



Source: https://stackoverflow.com/questions/9499851/regex-for-validating-alphabetics-and-numbers-in-the-localized-string
