Is there any library that can replace special characters with their ASCII equivalents, like:
"Cześć"
to:
"Czesc"
The package unidecode worked best for me:
from unidecode import unidecode
text = "Björn, Łukasz and Σωκράτης."
print(unidecode(text))
# ==> Bjorn, Lukasz and Sokrates.
You might need to install the package:
pip install unidecode
The above solution is easier and more robust than encoding and then decoding the output of unicodedata.normalize(), as suggested in other answers:
# This doesn't work as expected:
import unicodedata

text = "Björn, Łukasz and Σωκράτης."
ret = unicodedata.normalize('NFKD', text).encode('ascii', 'ignore')
print(ret)
# ==> b'Bjorn, ukasz and .'
# Besides dropping characters that have no NFKD decomposition (Ł, the
# Greek letters), the returned value is a bytes object in Python 3.
# To get a str back:
ret = ret.decode("ascii")  # (not required in Python 2)
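That said, if installing a third-party package is not an option, a stdlib-only approximation is possible: NFKD-decompose the text, strip the combining marks, and handle the characters that have no decomposition (like "Ł") with a small manual table. This is a sketch, not a full transliterator; the MANUAL table below is an illustrative assumption you would extend for your data.

from unicodedata import normalize

# Characters with no NFKD decomposition need explicit mappings
# (extend this table as needed for your input):
MANUAL = str.maketrans({"Ł": "L", "ł": "l", "Ø": "O", "ø": "o", "ß": "ss"})

def ascii_fold(text: str) -> str:
    # Apply manual replacements first, then decompose accented
    # letters into base letter + combining mark, and finally drop
    # everything that still isn't ASCII (the combining marks).
    decomposed = normalize("NFKD", text.translate(MANUAL))
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(ascii_fold("Cześć"))   # ==> Czesc
print(ascii_fold("Łukasz"))  # ==> Lukasz

Unlike unidecode, this still silently drops scripts it has no mapping for (Greek, Cyrillic, CJK), so it only covers Latin-based text.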