Replace special characters with ASCII equivalent

星月不相逢 2020-12-08 10:16

Is there any library that can replace special characters with their ASCII equivalents, for example turning:

"Cześć"

into:

"Czesc"

6 Answers
  •  失恋的感觉
    2020-12-08 10:48

    You can get most of the way by doing:

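    import unicodedata

    def strip_accents(text):
        # NFKD splits each accented letter into a base letter plus combining marks;
        # dropping the nonspacing marks (category 'Mn') leaves only the base letters.
        decomposed = unicodedata.normalize('NFKD', text)
        return ''.join(c for c in decomposed if unicodedata.category(c) != 'Mn')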
    

    Unfortunately, some Latin letters cannot be decomposed into an ASCII letter plus combining marks, so normalization leaves them unchanged and you have to map them manually. These include:

    • Æ → AE
    • Ð → D
    • Ø → O
    • Þ → TH
    • ß → ss
    • æ → ae
    • ð → d
    • ø → o
    • þ → th
    • Œ → OE
    • œ → oe
    • ƒ → f
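
    The manual mapping can be combined with the normalization step. The following is only a sketch: the names MANUAL_REPLACEMENTS and to_ascii are made up for illustration, and the two Polish entries (Ł, ł) are an addition that is not in the list above; they are included because these letters also have no Unicode decomposition.

```python
import unicodedata

# Letters with no Unicode decomposition, taken from the list above.
# The last line (Ł, ł) is an addition for illustration, not part of that list.
MANUAL_REPLACEMENTS = {
    'Æ': 'AE', 'Ð': 'D', 'Ø': 'O', 'Þ': 'TH', 'ß': 'ss',
    'æ': 'ae', 'ð': 'd', 'ø': 'o', 'þ': 'th',
    'Œ': 'OE', 'œ': 'oe', 'ƒ': 'f',
    'Ł': 'L', 'ł': 'l',
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(MANUAL_REPLACEMENTS)


def to_ascii(text):
    """Apply the manual replacements, then strip combining marks."""
    text = text.translate(_TABLE)
    decomposed = unicodedata.normalize('NFKD', text)
    return ''.join(c for c in decomposed if unicodedata.category(c) != 'Mn')
```

    With this sketch, to_ascii("Cześć") returns "Czesc" and to_ascii("Straße") returns "Strasse". Characters that are neither in the table nor decomposable pass through unchanged, so the table may need more entries depending on your input.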
