python-re: How do I match an alpha character

后端未结

关注

 3  1543

借酒劲吻你 2020-11-30 08:42

How can I match an alpha character with a regular expression. I want a character that is in \\w but is not in \\d. I want it unicode compatible tha

3条回答

情深已故 (楼主)

2020-11-30 09:00

What about:

\p{L}

You can to use this document as reference: Unicode Regular Expressions

EDIT: Seems Python doesn't handle Unicode expressions. Take a look into this link: Handling Accented Characters with Python Regular Expressions -- [A-Z] just isn't good enough (no longer active, link to internet archive)

Another references:

re.UNICODE
python and regular expression with unicode
Unicode Technical Standard #18: Unicode Regular Expressions

For posterity, here are the examples on the blog:

import re
string = 'richÃ©'
print string
richÃ©

richre = re.compile('([A-z]+)')
match = richre.match(string)
print match.groups()
('rich',)

richre = re.compile('(\w+)',re.LOCALE)
match = richre.match(string)
print match.groups()
('rich',)

richre = re.compile('([Ã©\w]+)')
match = richre.match(string)
print match.groups()
('rich\xe9',)

richre = re.compile('([\xe9\w]+)')
match = richre.match(string)
print match.groups()
('rich\xe9',)

richre = re.compile('([\xe9-\xf8\w]+)')
match = richre.match(string)
print match.groups()
('rich\xe9',)

string = 'richÃ©Ã±'
match = richre.match(string)
print match.groups()
('rich\xe9\xf1',)

richre = re.compile('([\u00E9-\u00F8\w]+)')
print match.groups()
('rich\xe9\xf1',)

matched = match.group(1)
print matched
richÃ©Ã±

0 讨论(0)

查看其它3个回答