Regex - match a character and all its diacritic variations (aka accent-insensitive)
I am trying to match a character and all its possible diacritic variations (aka accent-insensitive) with a regular expression. What I could do, of course, is:

    re.match(r"^[eēéěèȅêęëėẹẽĕȇȩę̋ḕḗḙḛḝė̄]$", "é")

but that is not a general solution. If I use Unicode categories like \pL, I can't restrict the match to a specific base character, in this case e.

A workaround to achieve the desired goal would be to use unidecode to strip all diacritics first, and then just match against the plain e:

    re.match(r"^e$", unidecode("é"))

Or, in this simplified case:

    unidecode("é") == "e"

Another solution which doesn't