Regex “\w” doesn't process utf-8 characters in Ruby 1.9.2

Submitted by 冷眼眸甩不掉的悲伤 on 2020-01-24 12:41:49

Question


The regex \w doesn't match UTF-8 characters in Ruby 1.9.2. Has anybody faced the same problem?

Example:

/[\w\s]+/u

In my Rails application.rb I've added config.encoding = "utf-8".


Answer 1:


What do you mean by "doesn't match utf-8 characters"? If you expect \w to match anything other than the uppercase and lowercase ASCII letters, the ASCII digits, and the underscore, it won't: Ruby 1.9 defines \w as equivalent to [A-Za-z0-9_], regardless of the string's encoding or the /u flag. Maybe you want \p{Word} or something similar instead.

Ref: Ruby 1.9 Regexp documentation (see section "Character Classes").
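
To make the difference concrete, here is a minimal sketch comparing \w with \p{Word} on a UTF-8 string. The sample text and variable names are made up for illustration, and the accented letters are written as \u escapes so the example does not depend on how the file was saved.

```ruby
# encoding: utf-8

# Hypothetical sample text: "niño café", written with \u escapes.
text = "ni\u00F1o caf\u00E9"

# \w only covers [A-Za-z0-9_], so the accented letters split the words.
ascii_words = text.scan(/\w+/)            # => ["ni", "o", "caf"]

# \p{Word} is the Unicode-aware word class.
unicode_words = text.scan(/\p{Word}+/)    # => ["niño", "café"]

# The pattern from the question, anchored to the whole string.
# The /u flag does not change what \w matches.
whole_ascii   = text =~ /\A[\w\s]+\z/u        # => nil
whole_unicode = text =~ /\A[\p{Word}\s]+\z/   # => 0

p ascii_words, unicode_words, whole_ascii, whole_unicode
```

So for the pattern in the question, swapping \w for \p{Word} inside the brackets, as in /[\p{Word}\s]+/, is probably the smallest change that works.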




Answer 2:


You could always use something like

[a-zA-Z0-9_ñáéíóú] 

instead of \w
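
A quick sketch of how that explicit class behaves, assuming a UTF-8 source file. The constant name SPANISH_WORD and the sample words are made up for illustration. Note that the class only matches the letters actually listed, so an uppercase Ñ or a ü would still be skipped unless you add them.

```ruby
# encoding: utf-8

# Hypothetical constant name; the class itself is the one suggested above.
SPANISH_WORD = /[a-zA-Z0-9_ñáéíóú]+/

# "niño": every letter is in the class, so the whole word matches.
listed = "ni\u00F1o".scan(SPANISH_WORD)            # => ["niño"]

# "Ñandú": the uppercase Ñ is not listed, so the match starts after it.
unlisted = "\u00D1and\u00FA".scan(SPANISH_WORD)    # => ["andú"]

p listed, unlisted
```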



Source: https://stackoverflow.com/questions/3975894/regex-w-doesnt-process-utf-8-characters-in-ruby-1-9-2
