Python regex \w doesn't match combining diacritics?
Question

I have a UTF-8 string with combining diacritics, and I want to match it with the \w regex sequence. \w matches precomposed accented characters, but not a Latin letter followed by a combining diacritic.

    >>> import re
    >>> re.match("a\w\w\wz", u"aoooz", re.UNICODE)
    <_sre.SRE_Match object at 0xb7788f38>
    >>> print u"ao\u00F3oz"
    aoóoz
    >>> re.match("a\w\w\wz", u"ao\u00F3oz", re.UNICODE)
    <_sre.SRE_Match object at 0xb7788f38>
    >>> re.match("a\w\w\wz", u"aoo\u0301oz", re.UNICODE)
    >>> print u"aoo\u0301oz"
    aóooz

(Looks