Pandas df.to_csv("file.csv", encoding="utf-8") still gives trash characters for minus sign

自闭症患者 2020-12-23 09:28

I've read something about a Python 2 limitation with respect to Pandas' to_csv( ... etc ...). Have I hit it? I'm on Python 2.7.3.

This produces trash characters.

1 answer
  • 2020-12-23 10:02

    Your "bad" output is UTF-8 displayed as CP1252.
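    To see why the output looks like trash, here is a minimal sketch of the misreading. It assumes the character in question is U+2212 (MINUS SIGN), which the question does not actually state, and it is written for Python 3 using only the standard library:

```python
# Assumption: the "minus sign" in the data is U+2212 (MINUS SIGN),
# not the ASCII hyphen-minus U+002D.
minus = u"\u2212"

# UTF-8 encodes U+2212 as three bytes.
utf8_bytes = minus.encode("utf-8")
print(repr(utf8_bytes))

# A program that assumes CP1252 turns each byte into its own character,
# so one minus sign is shown as three unrelated characters
# (U+00E2, U+02C6, U+2019).
misread = utf8_bytes.decode("cp1252")
print(len(misread))
```

    The bytes in the file are correct UTF-8; only the viewer's choice of decoder is wrong.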

    On Windows, many editors assume the default ANSI encoding (CP1252 on US Windows) instead of UTF-8 if there is no byte order mark (BOM) character at the start of the file. While a BOM is meaningless to the UTF-8 encoding, its UTF-8-encoded presence serves as a signature for some programs. For example, Microsoft Office's Excel requires it even on non-Windows OSes. Try:

    df.to_csv('file.csv',encoding='utf-8-sig')
    

    That encoder will add the BOM.
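    As a rough illustration of what the 'utf-8-sig' codec does, the sketch below encodes a small, made-up piece of CSV text both ways using only the standard library (Python 3). The sample text is hypothetical, and pandas itself is not involved here:

```python
import codecs

# Hypothetical CSV content containing U+2212 (MINUS SIGN).
text = u"a,b\n\u22121,2\n"

plain = text.encode("utf-8")        # no signature
signed = text.encode("utf-8-sig")   # same bytes, prefixed with the BOM

# The only difference is the three-byte UTF-8 BOM at the start.
print(repr(codecs.BOM_UTF8))
print(signed == codecs.BOM_UTF8 + plain)

# Decoding with 'utf-8-sig' strips the BOM again, so the text round-trips.
roundtrip = signed.decode("utf-8-sig")
print(roundtrip == text)
```

    Programs that look for the signature can then detect the file as UTF-8 instead of falling back to the ANSI code page.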
