Question
I am trying to validate place names in Python 3 / Django forms. I want to match strings like Los Angeles, Canada, 中国, and Россия. That is, the string contains:
- spaces
- alphabetic characters (from any language)
- no numbers
- no special characters (punctuation, symbols etc.)
The pattern I am currently using is r'^[^\W\d]+$', as suggested in "How to match alphabetical chars without numeric chars with Python regexp?". However, it only seems to behave like the pattern r'^[a-zA-Z]+$'. That is, Россия, Los Angeles and 中国 do not match; only Canada does.
An example of my code:
import re
re.search(r'^[^\W\d]+$', 'Россия')
This returns None instead of a match object.
Answer 1:
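Your example works for me, but it will match underscores and not spaces, because \w includes the underscore and excludes whitespace. Excluding the underscore and allowing a space as an alternative works: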
>>> re.search(r'^(?:[^\W\d_]| )+$', 'Los Angeles')
<_sre.SRE_Match object at 0x0000000003C612A0>
>>> re.search(r'^(?:[^\W\d_]| )+$', 'Россия')
<_sre.SRE_Match object at 0x0000000003A0D030>
>>> re.search(r'^(?:[^\W\d_]| )+$', 'Los_Angeles') # not found
>>> re.search(r'^(?:[^\W\d_]| )+$', '中国')
<_sre.SRE_Match object at 0x0000000003C612A0>
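For use in a form validator, the same pattern can be wrapped in a small helper. This is only a sketch: the names PLACE_NAME_RE and is_place_name are hypothetical and not from the question, and it uses re.fullmatch in place of the ^...$ anchors.

```python
import re

# [^\W\d_] matches any word character that is neither a decimal digit
# nor an underscore; for str patterns in Python 3 this covers letters
# from any script. The alternative branch allows a plain space.
PLACE_NAME_RE = re.compile(r'(?:[^\W\d_]| )+')


def is_place_name(value):
    """Return True if the whole string matches the letters-and-spaces pattern."""
    return PLACE_NAME_RE.fullmatch(value) is not None


# Quick check against the examples from the question, plus two rejects.
for name in ['Los Angeles', 'Canada', '中国', 'Россия', 'Los_Angeles', 'Area 51']:
    print(name, is_place_name(name))
```

One reason to prefer fullmatch here: with search and a trailing $, a string such as 'Canada\n' still matches, because $ also matches just before a final newline. fullmatch rejects it.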
Source: https://stackoverflow.com/questions/34957154/how-to-match-all-unicode-alphabetic-characters-and-spaces-in-a-regex