Question
The regex \w doesn't match UTF-8 characters in Ruby 1.9.2. Has anybody faced the same problem?
Example:
/[\w\s]+/u
In my Rails application.rb I've added config.encoding = "utf-8".
Answer 1:
What do you mean by "doesn't match utf-8 characters"? If you expect \w to match anything other than the uppercase and lowercase ASCII letters, the ASCII digits, and underscore, it won't: Ruby 1.9 defines \w as equivalent to [A-Za-z0-9_], regardless of the string's encoding or the /u flag. You probably want \p{Word} or a similar Unicode property class instead.
Ref: Ruby 1.9 Regexp documentation (see section "Character Classes").
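To make the difference concrete, here is a minimal sketch comparing the two classes on a UTF-8 string. The sample text and variable names are made up for illustration; the magic comment on the first line matters on Ruby 1.9, where source files are not UTF-8 by default.

```ruby
# encoding: utf-8

# Hypothetical sample input containing non-ASCII letters.
text = "niño_1 café"

# \w is ASCII-only, so the match stops just before the first non-ASCII letter.
ascii_match = text[/[\w\s]+/]

# \p{Word} also covers Unicode letters, marks, digits and the underscore.
unicode_match = text[/[\p{Word}\s]+/]

puts ascii_match    # prints "ni"
puts unicode_match  # prints "niño_1 café"
```

So for the pattern in the question, /[\p{Word}\s]+/ is probably closer to what was intended than /[\w\s]+/u.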
Answer 2:
You could always use something like
[a-zA-Z0-9_ñáéíóú]
instead of \w, adding whichever accented letters your input actually contains.
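A quick sketch of how that explicit class behaves, assuming Spanish-style input plus one word it was not written for. The word list and variable names are hypothetical. Any letter missing from the list (here ü) is simply skipped, which is the trade-off against \p{Word}.

```ruby
# encoding: utf-8

# The hand-written class from this answer, repeated one or more times.
explicit = /[a-zA-Z0-9_ñáéíóú]+/

# Hypothetical test words: one covered by the class, one not.
words = ["año", "über"]

# String#[] with a regex returns the first matching substring.
explicit_results = words.map { |w| w[explicit] }
unicode_results  = words.map { |w| w[/\p{Word}+/] }

p explicit_results  # prints ["año", "ber"]
p unicode_results   # prints ["año", "über"]
```

The explicit list is fine when the alphabet is known and small, but it has to be extended by hand for every new letter, including uppercase accented forms.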
Source: https://stackoverflow.com/questions/3975894/regex-w-doesnt-process-utf-8-characters-in-ruby-1-9-2