Fixing mojibake in UTF-8 text
Question

I have a file with Portuguese text in UTF-8. Whoever produced the file apparently selected the wrong encoding, and the text is full of mojibake:

IDENTIFICAÌàÌÄO instead of identificação
André instead of André

Automated tools do not see anything wrong with the file. I tried to fix it with the Python package ftfy, to no avail. How can I fix this file, apart from replacing all the incorrect characters manually?

Answer 1:

"André" instead of "André" is the Latin-1 interpretation of the UTF-8 encoding. You can fix it
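
One common way to undo this kind of damage is to re-encode the string as Latin-1 and decode the resulting bytes as UTF-8. The sketch below assumes the text really is UTF-8 bytes that were decoded as Latin-1. The helper name `fix_latin1_mojibake` is hypothetical, and the sketch only covers the "André" case, not the IDENTIFICAÌàÌÄO one. If the wrong decoding was actually Windows-1252 rather than Latin-1, the codec name would need to change accordingly.

```python
def fix_latin1_mojibake(text: str) -> str:
    """Undo UTF-8 bytes that were wrongly decoded as Latin-1.

    Hypothetical helper: assumes every character in `text` maps back to a
    single Latin-1 byte and that those bytes form valid UTF-8.
    """
    try:
        # Recover the original bytes, then decode them with the right codec.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of mojibake (or already correct): leave it unchanged.
        return text


print(fix_latin1_mojibake("André"))  # André
```

Because the helper returns its input unchanged when the round trip fails, it can be applied line by line to a file in which only some lines are damaged; text that is already correct, such as "André", passes through untouched.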