How to match all unicode alphabetic characters and spaces in a regex?

Question

I am trying to validate place names in Python 3 / Django forms. I want to match strings like: Los Angeles, Canada, 中国, and Россия. That is, the string contains:

spaces
alphabetic characters (from any language)
no numbers
no special characters (punctuation, symbols etc.)

The pattern I am currently using is r'^[^\W\d]+$', as suggested in "How to match alphabetical chars without numeric chars with Python regexp?". However, it only seems to behave like the pattern r'^[a-zA-Z]+$'. That is, Россия, Los Angeles and 中国 do not match; only Canada does.

An example of my code:

import re
re.search(r'^[^\W\d]+$', 'Россия')

This returns None.

Answer 1:

Your example works for me, but it matches underscores (\w includes _) and does not match spaces. This works:

>>> re.search(r'^(?:[^\W\d_]| )+$', 'Los Angeles')
<_sre.SRE_Match object at 0x0000000003C612A0>
>>> re.search(r'^(?:[^\W\d_]| )+$', 'Россия')
<_sre.SRE_Match object at 0x0000000003A0D030>
>>> re.search(r'^(?:[^\W\d_]| )+$', 'Los_Angeles') # not found
>>> re.search(r'^(?:[^\W\d_]| )+$', '中国')
<_sre.SRE_Match object at 0x0000000003C612A0>
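The same pattern can be wrapped in a small helper for use in a form validator. This is only a minimal sketch: the names PLACE_NAME_RE and is_place_name are hypothetical and not part of the original answer, and it uses re.fullmatch instead of ^...$ so that a trailing newline is rejected too. It also shows one possible cause of the behaviour described in the question, on the assumption that the pattern was somehow evaluated in ASCII-only mode (re.ASCII in Python 3, or Python 2 without re.UNICODE); the question does not confirm this.

```python
import re

# [^\W\d_] reads as "a word character that is neither a digit nor an
# underscore", which is approximately "a letter in any script".
# The second alternative allows a plain space.
PLACE_NAME_RE = re.compile(r'(?:[^\W\d_]| )+')

# Assumption for illustration: the same pattern in ASCII-only mode,
# where \w covers just [a-zA-Z0-9_].
ASCII_ONLY_RE = re.compile(r'(?:[^\W\d_]| )+', re.ASCII)


def is_place_name(value):
    """Return True if value consists only of letters and spaces."""
    # fullmatch must consume the whole string, so no ^ or $ is needed
    return PLACE_NAME_RE.fullmatch(value) is not None


if __name__ == '__main__':
    samples = ['Los Angeles', 'Canada', '中国', 'Россия',
               'Los_Angeles', 'Area 51', 'Canada\n']
    for name in samples:
        print(repr(name), is_place_name(name))
    # In ASCII-only mode the Cyrillic name no longer matches
    print(ASCII_ONLY_RE.fullmatch('Россия'))
```

In a Django form, a helper like this could be called from a field's clean method or a custom validator; how that is wired up depends on the project and is left out here.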

Source: https://stackoverflow.com/questions/34957154/how-to-match-all-unicode-alphabetic-characters-and-spaces-in-a-regex

Tags

regex

python-3.x

unicode

django-forms
