Regular Expression for validating username

Submitted by 夏天 on 2020-01-06 01:12:52

Question


I'm looking for a regular expression to validate a username.

The username may contain:

  • Letters (Western, Greek, Russian, etc.)
  • Numbers
  • Spaces, but only 1 at a time
  • Special characters (for example: "!@#$%^&*.:;<>?/\|{}[]_+=-"), but only 1 at a time

EDIT:

Sorry for the confusion.

  • I need it for Cocoa Touch, but I'll have to translate it to PHP for the server side anyway.
  • By "1 at a time" I mean that spaces or special characters should be separated by letters or numbers.

Answer 1:


Instead of writing one big regular expression, it would be clearer to write separate regular expressions to test each of your desired conditions.

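  • Test whether the username contains only letters, numbers, ASCII symbols ! through @, and space: ^(\p{L}|\p{N}|[!-@]| )+$. This must match for the username to be valid. Note the use of the \p{L} class for Unicode letters and the \p{N} class for Unicode numbers. Also note that the range [!-@] covers ASCII 0x21 to 0x40 only, so it includes the digits 0-9 but not [ \ ] ^ _ { | }; add those to the class if you want to allow every symbol from your example.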

  • Test whether the username contains consecutive spaces: \s\s+. If this matches, the username is invalid.

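  • Test whether symbols occur consecutively: [!-/:-@][!-/:-@]+. If this matches, the username is invalid. (The digits 0-9 sit inside the range [!-@], so using [!-@] here would also reject names with two adjacent digits, such as "Bob42".)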

This satisfies your criteria exactly as written.
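The three separate tests could be combined in code roughly as follows. This is only a sketch in Java: the class name UsernameChecks and the method isValid are made up for illustration, and the symbol class is written out from the list in the question rather than as the range [!-@], so that digits are not treated as symbols.

```java
import java.util.regex.Pattern;

// Hypothetical helper; class and method names are illustrative only.
final class UsernameChecks {
    // Symbols listed in the question, written as an explicit character class (an assumption,
    // used here instead of the range [!-@]).
    private static final String SYMBOL = "[!@#$%^&*.:;<>?/\\\\|{}\\[\\]_+=-]";

    // Test 1: only letters, numbers, the listed symbols and plain spaces.
    private static final Pattern ALLOWED =
            Pattern.compile("^(?:\\p{L}|\\p{N}|" + SYMBOL + "| )+$");
    // Test 2: two or more spaces in a row.
    private static final Pattern DOUBLE_SPACE = Pattern.compile(" {2,}");
    // Test 3: two or more symbols in a row.
    private static final Pattern DOUBLE_SYMBOL = Pattern.compile(SYMBOL + "{2,}");

    static boolean isValid(String name) {
        return ALLOWED.matcher(name).matches()
                && !DOUBLE_SPACE.matcher(name).find()
                && !DOUBLE_SYMBOL.matcher(name).find();
    }

    public static void main(String[] args) {
        // The last sample is a Cyrillic name followed by " 42", written with Unicode escapes.
        String[] samples = {"Bill Bob", "Bill  Bob", "a!!b", "\u0418\u0432\u0430\u043D 42"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```

Because spaces and symbols are checked by two independent tests, a space directly followed by a symbol (as in "a !b") still passes. If separators of either kind must always be separated by letters or numbers, tests 2 and 3 would need to be merged into one class.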

However, depending on how the usernames have been written, perfectly valid names like "Éponine" may still be rejected by this approach. This is because the "É" could be written either as U+00C9 LATIN CAPITAL LETTER E WITH ACUTE (which is matched by \p{L}) or as E followed by U+0301 COMBINING ACUTE ACCENT (which is a combining mark and is not matched by \p{L}).

Regular-Expressions.info says it better:

Again, "character" really means "Unicode code point". \p{L} matches a single code point in the category "letter". If your input string is à encoded as U+0061 U+0300, it matches a without the accent. If the input is à encoded as U+00E0, it matches à with the accent. The reason is that both the code points U+0061 (a) and U+00E0 (à) are in the category "letter", while U+0300 is in the category "mark".
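One way to reduce this problem is to normalize the input to a composed form before matching. The sketch below assumes Java and its standard java.text.Normalizer class; the class and method names are made up for illustration. Not every letter-plus-mark sequence has a precomposed code point, so this narrows the gap without closing it entirely.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
final class NormalizedLetters {
    private static final Pattern LETTERS_ONLY = Pattern.compile("^\\p{L}+$");

    // Matches the string exactly as given.
    static boolean lettersOnlyRaw(String s) {
        return LETTERS_ONLY.matcher(s).matches();
    }

    // Composes letter + combining mark pairs (NFC) before matching.
    static boolean lettersOnlyNfc(String s) {
        String composed = Normalizer.normalize(s, Normalizer.Form.NFC);
        return LETTERS_ONLY.matcher(composed).matches();
    }

    public static void main(String[] args) {
        String decomposed = "E\u0301ponine";  // E followed by COMBINING ACUTE ACCENT
        String precomposed = "\u00C9ponine";  // single precomposed code point
        System.out.println(lettersOnlyRaw(decomposed));   // false: the mark is not a letter
        System.out.println(lettersOnlyNfc(decomposed));   // true: NFC yields the precomposed letter
        System.out.println(lettersOnlyRaw(precomposed));  // true
    }
}
```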

Unicode is hairy, and restricting the characters in usernames is not necessarily a good idea anyway. Are you sure you want to do this?




Answer 2:


The expression

^(\w| (?! )|["!@#$%^&*.:;<>?/\|{}\[\]_+=\-")](?!["!@#$%^&*.:;<>?/\|{}\[\]_+=\-")]))*$

will mostly do what you want, if your dialect supports look-ahead assertions. See it in action at RegExr.

Please ask yourself why you want to limit usernames in this way. Most of the time, usernames starting with "!!" should not be an issue, and you annoy users if you reject their desired username.

Edit: In many regex engines, \w does not match non-Latin characters. To allow them, replace \w with [\p{L}\p{N}] (\p{L} alone would drop the digits that \w covers), which may or may not work depending on your regex implementation. RegExr unfortunately does not support it.
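For an engine that supports both look-ahead and Unicode property classes, the idea can be sketched as below. This is an adaptation in Java, not the exact expression above: it uses [\p{L}\p{N}] in place of \w, uses only the symbols listed in the question (without the stray " and ) characters), and uses + instead of * so that an empty username is rejected. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
final class LookaheadUsername {
    // Symbols listed in the question, as an explicit character class (an assumption).
    private static final String SYMBOL = "[!@#$%^&*.:;<>?/\\\\|{}\\[\\]_+=-]";

    private static final Pattern USERNAME = Pattern.compile(
            "^(?:[\\p{L}\\p{N}]"                    // a letter or a number
            + "| (?! )"                             // or a space not followed by a space
            + "|" + SYMBOL + "(?!" + SYMBOL + ")"   // or a symbol not followed by a symbol
            + ")+$");

    static boolean matches(String name) {
        return USERNAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"John Smith", "John  Smith", "a.b-c", "a..b"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + matches(s));
        }
    }
}
```

Each alternative consumes exactly one character and the alternatives do not overlap, so the engine never has more than one way to match a given prefix.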




Answer 3:


Try this:

^[!@#$%^&*.:;<>?\/\|{}\[\]_+= -]?([\p{L}\d]+[!@#$%^&*.:;<>?/\|{}\[\]_+= -]?)+$

See it on Rubular.




Answer 4:


You want something like

string strUserName = "BillYBob Stev#nS0&";
Regex regex = new Regex(@"(?i)\b(\w+\p{P}*\p{S}*\p{Z}*\p{C}*\s?)+\b");
Match match = regex.Match(strUserName);

If you want this explained, let me know.

I hope this helps.

Note: This is case insensitive.




Answer 5:


Since I don't know which language you need this solution in, I am providing an answer in Java. It can be translated to any other platform:

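import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "à123 àà@bcà#";
// One or more runs of letters/digits, each optionally followed by a single symbol or whitespace character
String regex = "^([\\p{L}\\d]+[!@#$%\\^&\\*.:;<>\\?/\\|{}\\[\\]_\\+=\\s-]?)+$";
Pattern p = Pattern.compile(regex);
Matcher matcher = p.matcher(str);
// Print the matched text when the pattern is found
if (matcher.find())
   System.out.println("Matched: " + matcher.group());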

One assumption I made is that the username starts with either a Unicode letter or a number.



Source: https://stackoverflow.com/questions/10298776/regular-expression-for-validating-username
