Python: Convert Unicode to ASCII without errors for CSV file

忘了有多久 2020-12-18 06:28

I've been reading all the questions regarding conversion from Unicode to CSV in Python here on StackOverflow and I'm still lost. Every time, I receive a "UnicodeEncodeError: '

1 Answer
  • 2020-12-18 06:45

    Correct, ñ is not a valid ASCII character, so you can't encode it to ASCII. You can, as your code does above, ignore such characters. Another option is to remove the accents; see: What is the best way to remove accents in a Python unicode string?
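
To make the two options concrete, here is a minimal sketch. It assumes Python 3 (where `str` is already Unicode) rather than the Python 2 used in the question, and the sample string is only an illustration:

```python
import unicodedata

text = "LIMPIADOR BA\xd1O 1'5 L"  # sample value; \xd1 is Ñ

# Option 1: silently drop every character that is not ASCII.
ignored = text.encode("ascii", "ignore").decode("ascii")

# Option 2: decompose accented letters (NFKD) into base letter + combining
# mark, then drop the combining marks, which are not ASCII.
decomposed = unicodedata.normalize("NFKD", text)
stripped = decomposed.encode("ascii", "ignore").decode("ascii")

print(ignored)   # LIMPIADOR BAO 1'5 L
print(stripped)  # LIMPIADOR BANO 1'5 L
```

Neither result is the original word: the first loses a letter, the second turns Ñ into a plain N.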

    But note that both techniques can have bad effects, such as changing what a word actually means. So the best option is to keep the accents. Then you can't use ASCII, but you can use another encoding. UTF-8 is the safe bet. Latin-1 (ISO-8859-1) is a common one, but it covers only Western European characters. CP-1252 is common on Windows, and so on.
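
A quick way to see which of these encodings can represent a given string is to try each one. This is a sketch assuming Python 3; `can_encode` is a hypothetical helper, not a standard library function, and the sample strings are illustrative:

```python
def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

spanish = "BA\xd1O"   # contains Ñ, a Western European character
cyrillic = "\u0416"   # Ж, outside the Western European range

# One line per encoding: name, result for the Spanish word, result for Ж.
for name in ("ascii", "latin-1", "cp1252", "utf-8"):
    print(name, can_encode(spanish, name), can_encode(cyrillic, name))
```

Only UTF-8 reports `True` for both strings, which is why it is the safe default.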

    So just switch "ascii" for whatever encoding you want.
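
For readers on Python 3, the same idea looks a little different, because there the file object does the encoding and the row stays as text. The following is a sketch under that assumption; the temporary path and the sample row are made up for illustration:

```python
import csv
import os
import tempfile

row = (56, "LIMPIADOR BA\xd1O 1'5 L")  # sample row; \xd1 is Ñ

# Hypothetical output location, used only for this example.
path = os.path.join(tempfile.mkdtemp(), "out.csv")

# Pick the encoding when opening the file; newline="" lets the csv
# module control line endings itself.
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerow(row)

# Read the raw bytes back to confirm Ñ was written as UTF-8 (0xC3 0x91).
with open(path, "rb") as f:
    raw = f.read()

print(raw)
```

Changing `encoding="utf-8"` to `"latin-1"` or `"cp1252"` is all that is needed to target a different encoding.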


    Your actual code, according to your comment, is:

    writer.writerow([s.encode('utf8') if type(s) is unicode else s for s in row]) 
    

    where

    row = (56, u"LIMPIADOR BA\xd1O 1'5 L")
    

    Now, I believe that should work, but apparently it doesn't. I think unicode gets passed into the csv writer by mistake anyway. Unwrap that long line into its parts:

    col1, col2 = row # Use the names of what is actually there instead
    row = col1, col2.encode('utf8')
    writer.writerow(row) 
    

    Now your real error will not be hidden by the fact that you stick everything on the same line. This could probably also have been avoided if you had included a proper traceback.
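
As an illustration of what a proper traceback gives you, the sketch below deliberately triggers the error and reports where it happened. It assumes Python 3, and `encode_cell` is a hypothetical helper written for this example, not part of the original code:

```python
import traceback

def encode_cell(value, encoding="ascii"):
    """Encode text cells; pass non-text values through unchanged."""
    return value.encode(encoding) if isinstance(value, str) else value

row = (56, "LIMPIADOR BA\xd1O 1'5 L")  # sample row; \xd1 is Ñ

try:
    encoded = [encode_cell(cell) for cell in row]
except UnicodeEncodeError as exc:
    # The exception records which codec failed and where in the string.
    report = "{} cannot encode {!a} at position {}".format(
        exc.encoding, exc.object[exc.start:exc.end], exc.start)
    print(report)
    # The full traceback also shows which line attempted the encoding.
    print(traceback.format_exc())
```

The codec name and position in the message usually make it obvious which call is doing an encode you did not intend.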
