Unicode equivalents for \w and \b in Java regular expressions?

梦毁少年i 2020-11-22 06:25

Many modern regex implementations interpret the \w character class shorthand as "any letter, digit, or connecting punctuation" (usually: underscore). That wa

3 Answers
  •  感动是毒
    2020-11-22 06:36

    It's really unfortunate that \w doesn't work. The proposed solution \p{Alpha} doesn't work for me either.

    It seems [\p{L}] catches all Unicode letters. So the Unicode equivalent of \w should be [\p{L}\p{Digit}_].
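To make the comparison concrete, here is a minimal sketch that tries the class suggested above against a few sample strings. The class name `UnicodeWordDemo`, the helper `isWord`, and the sample strings are made up for illustration. Two caveats, as far as I can tell from the `java.util.regex.Pattern` documentation: `\p{Digit}` is a POSIX class that covers only ASCII `[0-9]` by default, so `\p{Nd}` may be the better choice for Unicode digits; and on Java 7 or later the `Pattern.UNICODE_CHARACTER_CLASS` flag makes plain `\w` Unicode-aware, which also picks up combining marks that `\p{L}` alone does not.

```java
import java.util.regex.Pattern;

public class UnicodeWordDemo {
    // Default \w in Java is ASCII-only: [a-zA-Z_0-9].
    static final Pattern ASCII_WORD = Pattern.compile("\\w+");

    // The class suggested in the answer above.
    static final Pattern ANSWER_WORD = Pattern.compile("[\\p{L}\\p{Digit}_]+");

    // Assumed variant: \p{Nd} (Unicode decimal digits) instead of the ASCII-only \p{Digit}.
    static final Pattern LETTER_ND_WORD = Pattern.compile("[\\p{L}\\p{Nd}_]+");

    // Java 7+: the flag switches \w (and \b, \d, \s, POSIX classes) to Unicode semantics.
    static final Pattern UNICODE_WORD =
            Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);

    // Hypothetical helper: true if the whole string consists of "word" characters.
    static boolean isWord(Pattern pattern, String s) {
        return pattern.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "hello_1",              // plain ASCII
            "\u00e9l\u00e8ve",      // French word with precomposed accented letters
            "\u0663",               // ARABIC-INDIC DIGIT THREE
            "e\u0301"               // 'e' followed by COMBINING ACUTE ACCENT
        };
        for (String s : samples) {
            System.out.println(s
                    + " ascii=" + isWord(ASCII_WORD, s)
                    + " answer=" + isWord(ANSWER_WORD, s)
                    + " letterNd=" + isWord(LETTER_ND_WORD, s)
                    + " unicodeFlag=" + isWord(UNICODE_WORD, s));
        }
    }
}
```

If the flag is available, it is probably the least error-prone option, since `\b` follows the same Unicode definition of a word character and stays consistent with `\w`.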
