python-re: How do I match an alpha character

后端 未结 3 1539
借酒劲吻你
借酒劲吻你 2020-11-30 08:42

How can I match an alpha character with a regular expression. I want a character that is in \\w but is not in \\d. I want it unicode compatible tha

3条回答
  •  情深已故
    2020-11-30 09:00

    What about:

    \p{L}
    

    You can to use this document as reference: Unicode Regular Expressions

    EDIT: Seems Python doesn't handle Unicode expressions. Take a look into this link: Handling Accented Characters with Python Regular Expressions -- [A-Z] just isn't good enough (no longer active, link to internet archive)

    Another references:

    • re.UNICODE
    • python and regular expression with unicode
    • Unicode Technical Standard #18: Unicode Regular Expressions

    For posterity, here are the examples on the blog:

    import re
    string = 'riché'
    print string
    riché
    
    richre = re.compile('([A-z]+)')
    match = richre.match(string)
    print match.groups()
    ('rich',)
    
    richre = re.compile('(\w+)',re.LOCALE)
    match = richre.match(string)
    print match.groups()
    ('rich',)
    
    richre = re.compile('([é\w]+)')
    match = richre.match(string)
    print match.groups()
    ('rich\xe9',)
    
    richre = re.compile('([\xe9\w]+)')
    match = richre.match(string)
    print match.groups()
    ('rich\xe9',)
    
    richre = re.compile('([\xe9-\xf8\w]+)')
    match = richre.match(string)
    print match.groups()
    ('rich\xe9',)
    
    string = 'richéñ'
    match = richre.match(string)
    print match.groups()
    ('rich\xe9\xf1',)
    
    richre = re.compile('([\u00E9-\u00F8\w]+)')
    print match.groups()
    ('rich\xe9\xf1',)
    
    matched = match.group(1)
    print matched
    richéñ
    

提交回复
热议问题