What's a good regex to include accented characters in a simple way?

死守一世寂寞 2020-12-14 02:12

Right now my regex is something like this:

[a-zA-Z0-9], but it does not match accented characters, which I would like it to. I would also like - \' , to be included.

4 answers
  •  孤街浪徒
    2020-12-14 02:54

    Accented Characters: DIY Character Range Subtraction

    If your regex engine allows it (and many will), this will work:

    (?i)^(?:(?![×Þß÷þø])[-'0-9a-zÀ-ÿ])+$
    

    Please see the demo (you can add characters to test).

    Explanation

    • (?i) sets case-insensitive mode
    • The ^ anchor asserts that we are at the beginning of the string
    • (?:(?![×Þß÷þø])[-'0-9a-zÀ-ÿ]) matches one character...
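    • The lookahead (?![×Þß÷þø]) asserts that the next character is not one of those in the brackets
    • [-'0-9a-zÀ-ÿ] allows dash, apostrophe, digits, letters, and characters in the wide accented range À-ÿ (U+00C0 to U+00FF); that range also contains non-letters such as × and ÷, which is why the lookahead subtracts from it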
    • The + matches that one or more times
    • The $ anchor asserts that we are at the end of the string
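
As a rough illustration of the pattern explained above, here is a minimal Python sketch. It assumes Python 3's built-in re module, which accepts the inline (?i) flag and lookaheads; the helper name is_valid_name and the sample strings are made up for this example and are not part of the original answer.

```python
import re

# Pattern copied from the answer; the inline (?i) flag makes it case-insensitive.
NAME_PATTERN = re.compile(r"(?i)^(?:(?![×Þß÷þø])[-'0-9a-zÀ-ÿ])+$")


def is_valid_name(text: str) -> bool:
    """Hypothetical helper: True if the whole string is accepted by the pattern."""
    # fullmatch is used because Python's $ alone would also accept
    # a single trailing newline at the end of the string.
    return NAME_PATTERN.fullmatch(text) is not None


# Illustrative inputs only (assumed, not from the original post).
samples = ["Renée", "O'Brien-Smith", "ÉCOLE", "3×4", "Straße", "hello world", ""]
for sample in samples:
    print(repr(sample), is_valid_name(sample))
```

With these inputs, the first three strings should be accepted and the rest rejected: × and ß are removed by the lookahead, the space is not in the character class, and the empty string fails because + requires at least one character.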

    Reference

    Extended ASCII Table
