Regular expression that allows letters (like “ñ”) from any language

Submitted by 笑着哭i on 2019-12-07 04:46:41

Question


I'm trying to let users enter special characters from other languages, such as Spanish or French. I originally had this:

 "/[^A-Za-z0-9\.\_\- ]/i" 

and then changed it to

 "/[^\p{L}\p{N}\.\_\-\(\) ]/i" 

but it still doesn't work. Letters such as "ñ" should be allowed. Thanks.

Revision: I found that adding (*UTF8) at the beginning helps solve the problem. So I'm using the following code: "/(*UTF8)[^\p{L}A-Za-z0-9._\- ]/i"

Revision: After looking at the answers, I decided to use: "/[^\p{Xwd}. -]/u". Thanks. (It works even with Chinese characters.)


Answer 1:


For languages written in the Latin script, you can use the \p{Latin} Unicode script property:

/[^\p{Latin}0-9._ -]/u

But if you want to allow letters and digits from every script:

/[^\p{Xwd}. -]/u

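The "u" modifier tells the regex engine to read both the pattern and the subject string as UTF-8. Without it, matching works byte by byte, so a multi-byte character such as "ñ" is never seen as a single letter.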
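
As a rough illustration of the same idea outside PCRE, here is a minimal Python sketch. It rests on a few assumptions: Python's built-in `re` module does not support `\p{...}` properties, so the Unicode-aware `\w` (letters, digits and underscore from any script) stands in for `\p{Xwd}`, and the function names are made up for this example. The byte pattern imitates what happens when the engine is not told that the input is UTF-8.

```python
import re

# Assumption: on a str pattern, Python 3's \w is Unicode-aware and is used
# here as a stand-in for PCRE's \p{Xwd}.
DISALLOWED = re.compile(r"[^\w. -]")

# Byte-oriented variant of the asker's first pattern, to imitate matching
# without UTF-8 awareness.
DISALLOWED_BYTES = re.compile(rb"[^A-Za-z0-9._\- ]")


def has_disallowed_chars(text: str) -> bool:
    """Return True if text contains a character outside the allowed set."""
    return DISALLOWED.search(text) is not None


def has_disallowed_bytes(text: str) -> bool:
    """Same kind of check, but run on the raw UTF-8 bytes of text."""
    return DISALLOWED_BYTES.search(text.encode("utf-8")) is not None


if __name__ == "__main__":
    for sample in ["Nu\u00f1ez", "\u4e2d\u6587", "a/b"]:
        print(sample, has_disallowed_chars(sample), has_disallowed_bytes(sample))
```

The character-level check accepts "ñ" and Chinese characters, while the byte-level check rejects them because each one is encoded as several bytes that fall outside the ASCII class.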




Answer 2:


You could also look into specifying a Unicode range, e.g. [\w\u00C0-\u024F.-]+, to include Latin extended letters (in PCRE the range is written \x{00C0}-\x{024F} and needs the u modifier). But it's hard to restrict input to such a broad subset: what about Chinese, Vietnamese, etc.? I'm with Dagon on this one: best to allow anything.
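
To see where a fixed range stops working, here is a small Python sketch of the range-based pattern. The assumptions: `re.ASCII` is set so that `\w` means only `[A-Za-z0-9_]`, mirroring engines where `\w` is not Unicode-aware, and `is_latin_extended_name` is a hypothetical helper name.

```python
import re

# ASCII-only \w plus U+00C0..U+024F: the accented Latin-1 letters (this span
# also contains the signs U+00D7 and U+00F7) and Latin Extended-A/B.
LATIN_EXTENDED = re.compile(r"[\w\u00C0-\u024F.-]+", re.ASCII)


def is_latin_extended_name(text: str) -> bool:
    """Return True if every character of text falls inside the range-based class."""
    return LATIN_EXTENDED.fullmatch(text) is not None


if __name__ == "__main__":
    # Spanish and Polish names pass; Vietnamese and Chinese do not.
    for sample in ["Nu\u00f1ez", "\u0141\u00f3d\u017a", "Vi\u1ec7t", "\u4e2d\u6587"]:
        print(sample, is_latin_extended_name(sample))
```

The Vietnamese "ệ" (U+1EC7) sits in the Latin Extended Additional block, beyond U+024F, so it is rejected along with Chinese characters, which is the limitation this answer warns about.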



Source: https://stackoverflow.com/questions/22052517/regular-expression-that-allows-letters-like-%c3%b1-from-any-language
