Question
Greetings,
I am developing a GWT application where users can enter their details in Japanese, but the 'userid' and 'password' fields should contain only English characters (the Latin alphabet). How can I validate strings for this?
Answer 1:
You can use String#matches() with a bit of regex for this. The ASCII Latin letters are covered by \w.
So this should do:
boolean valid = input.matches("\\w+");
By the way, this also accepts digits and the underscore _. If that is a problem, use [A-Za-z]+ instead.
If you want to cover letters with diacritical marks as well (ä, é, ò, and so on, which are by definition also Latin characters), you need to normalize the input and strip the diacritical marks before matching, simply because there is no (documented) regex class that covers them.
String clean = Normalizer.normalize(input, Form.NFD).replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
boolean valid = clean.matches("\\w+");
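Update: Java's regex engine also accepts \p{L}, which matches any Unicode letter and therefore covers letters with diacritics as well. Be aware that it also matches non-Latin letters, Japanese included, so on its own it does not restrict input to the Latin alphabet.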
boolean valid = input.matches("\\p{L}+");
The above works on Java 1.6.
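To make the difference between these patterns concrete, here is a minimal sketch. The class and method names are made up for illustration, and the \p{IsLatin} script class is not from the answer above; it assumes Java 7 or later. Non-ASCII test strings are best written as \u escapes so the source file encoding does not matter.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class LatinRegexDemo {
    // ASCII letters only.
    private static final Pattern ASCII_LETTERS = Pattern.compile("[A-Za-z]+");
    // Letters of the Latin script, precomposed accented letters included
    // (assumption: Java 7+, where script classes such as IsLatin are supported).
    private static final Pattern LATIN_SCRIPT = Pattern.compile("\\p{IsLatin}+");
    // Any Unicode letter, regardless of script.
    private static final Pattern ANY_LETTER = Pattern.compile("\\p{L}+");

    static boolean isAsciiLetters(String s) {
        return ASCII_LETTERS.matcher(s).matches();
    }

    static boolean isLatinScript(String s) {
        return LATIN_SCRIPT.matcher(s).matches();
    }

    static boolean isAnyLetter(String s) {
        return ANY_LETTER.matcher(s).matches();
    }

    // Decompose accented letters (NFD), then drop the combining marks.
    static String stripDiacritics(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }
}
```

With this sketch, isAsciiLetters rejects "caf\u00e9" (café), isLatinScript accepts it, and isAnyLetter also accepts a Japanese string such as "\u65e5\u672c\u8a9e", which is why \p{L} alone is too permissive for a userid check.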
Answer 2:
public static boolean isValidISOLatin1(String s) {
    return Charset.forName("US-ASCII").newEncoder().canEncode(s);
} // or "ISO-8859-1" for ISO Latin 1
For reference, see the documentation on Charset.
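As a rough illustration of how the choice of charset changes the result, here is a small sketch. The class name CharsetCheck and its method are hypothetical. Note that the encoder approach accepts every character the charset can represent, so digits and punctuation pass as well.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
final class CharsetCheck {
    // Returns true if every character of s can be encoded in the named charset.
    static boolean canEncode(String s, String charsetName) {
        // A fresh encoder per call, since CharsetEncoder instances are not thread-safe.
        return Charset.forName(charsetName).newEncoder().canEncode(s);
    }
}
```

For example, canEncode("caf\u00e9", "US-ASCII") is false while canEncode("caf\u00e9", "ISO-8859-1") is true, and a Japanese string fails under both charsets.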
Answer 3:
There might be a better approach, but you could load a collection with whatever you deem to be acceptable characters, and then check each character in the username/password field against that collection.
Pseudocode:
foreach (character in username)
{
    if !allowedCharacters.contains(character)
    {
        throw exception
    }
}
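The pseudocode above could look roughly like this in Java. This is only a sketch: the class name AllowedChars, the choice of allowed characters (ASCII letters and digits), and the exception type are assumptions, so adjust them to whatever you deem acceptable.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class, for illustration only.
final class AllowedChars {
    // Assumed whitelist: ASCII letters and digits.
    private static final Set<Character> ALLOWED = new HashSet<>();
    static {
        for (char c = 'a'; c <= 'z'; c++) ALLOWED.add(c);
        for (char c = 'A'; c <= 'Z'; c++) ALLOWED.add(c);
        for (char c = '0'; c <= '9'; c++) ALLOWED.add(c);
    }

    // Throws IllegalArgumentException at the first character outside the whitelist.
    static void validate(String value) {
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (!ALLOWED.contains(c)) {
                throw new IllegalArgumentException(
                        "Character not allowed at index " + i + ": " + c);
            }
        }
    }
}
```

Calling AllowedChars.validate("user01") returns normally, while a value containing a Japanese character throws.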
Answer 4:
For something this simple, I'd use a regular expression.
private static final Pattern p = Pattern.compile("\\p{Alpha}+");

static boolean isValid(String input) {
    Matcher m = p.matcher(input);
    return m.matches();
}
There are other pre-defined classes like \w that might work better.
Answer 5:
Here is my solution; it works well for me:
public static boolean isStringContainsLatinCharactersOnly(final String iStringToCheck)
{
    return iStringToCheck.matches("^[a-zA-Z0-9.]+$");
}
Answer 6:
I successfully used a combination of the answers of user232624, Joachim Sauer and Tvaroh:
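static CharsetEncoder asciiEncoder = Charset.forName("US-ASCII").newEncoder(); // or "ISO-8859-1" for ISO Latin 1

boolean isValid(String input) {
    // Every character must be a letter that the chosen charset can encode.
    for (char ch : input.toCharArray()) {
        if (!Character.isLetter(ch) || !asciiEncoder.canEncode(ch)) return false;
    }
    return true;
}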
Source: https://stackoverflow.com/questions/1911902/check-string-whether-it-contains-only-latin-characters