Python Pandas to_csv Output Returns Single Character for String/Object Values

Submitted by £可爱£侵袭症+ on 2019-12-24 11:37:50

Question


I'm loading query results into a pandas DataFrame. When I print the DataFrame, the string (object) values look correct, but when I call to_csv on it, the CSV output contains only the first character of every string/object value.

import pandas

# The value prints as "us" but carries a NUL character after each letter
df = pandas.DataFrame({'a':[u'u\x00s\x00']})
df.to_csv('test.csv')

I've also tried passing an explicit encoding to to_csv:

df.to_csv('test_encoded.csv', encoding='utf-8')

But I get the same result:

>>> print df
      a
0  us

(output in csv file)
u

For reference, I'm connecting to a Vertica database and using the following setup:

  • OS: Mac OS X Yosemite (10.10.5)
  • Python 2.7.10 |Anaconda 2.3.0 (x86_64)| (default, Sep 15 2015, 14:29:08)
  • pyodbc 3.0.10
  • pandas 0.16.2
  • ODBC: Vertica ODBC 6.1.3

Any help figuring out how to write the entire string with the pandas to_csv function would be greatly appreciated.


Answer 1:


I was facing the same problem and found the post "UTF-32 in Python" helpful.

To fix your problem, I believe you need to replace every '\x00' character with an empty string. I managed to write the correct CSV with the code below:

fixer = dict.fromkeys([0x00], u'')
df['a'] = df['a'].map(lambda x: x.translate(fixer))
df.to_csv('test.csv')
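
For anyone who wants to see what is going on in the value itself, here is a minimal sketch in plain Python (no pandas needed). It assumes, as the sample value suggests, that the driver handed back UTF-16-LE data one byte per character; that is a guess from the data, not something confirmed for every setup. The helper names `strip_nuls` and `redecode_utf16le` are made up for this illustration.

```python
# -*- coding: utf-8 -*-
# Sample value from the question: prints as "us" but hides two NULs.
raw = u'u\x00s\x00'


def strip_nuls(text):
    """Drop every NUL character (same effect as the translate() fix above)."""
    return text.replace(u'\x00', u'')


def redecode_utf16le(text):
    """Re-interpret the characters as UTF-16-LE bytes.

    Assumption: each character is really one byte of UTF-16-LE data.
    Raises UnicodeEncodeError if a character is above U+00FF, and
    UnicodeDecodeError if the number of characters is odd.
    """
    return text.encode('latin-1').decode('utf-16-le')


print(repr(raw))               # repr() exposes the NULs that print hides
print(strip_nuls(raw))         # NULs removed
print(redecode_utf16le(raw))   # bytes re-read as UTF-16-LE
```

Either helper can be applied per value with df['a'].map(strip_nuls). If the assumption about UTF-16-LE holds, stripping NULs is only safe for text whose characters all fit in one byte, so the re-decoding variant may be the better choice for non-ASCII data.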

To solve my problem with Vertica at the source, I had to change the driver manager encoding to UTF-16 in the file /Library/Vertica/ODBC/lib/vertica.ini, using the configuration below:

[Driver]
ErrorMessagesPath=/Library/Vertica/ODBC/messages/
ODBCInstLib=/usr/lib/libiodbcinst.dylib
DriverManagerEncoding=UTF-16
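
After changing the driver setting, it may be worth checking that fetched values no longer carry stray NULs before writing any CSV. The sketch below is only an illustration: `find_nul_values` is a hypothetical helper, and the hard-coded `rows` list stands in for whatever `cursor.fetchall()` returns from pyodbc.

```python
# -*- coding: utf-8 -*-
TEXT_TYPE = type(u'')  # unicode on Python 2, str on Python 3


def find_nul_values(rows):
    """Return (row_index, column_index, value) for each text value containing NUL."""
    hits = []
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            # Only text values are checked; numbers, None, dates are skipped.
            if isinstance(value, TEXT_TYPE) and u'\x00' in value:
                hits.append((r, c, value))
    return hits


# Stand-in data; in practice this would come from cursor.fetchall().
rows = [(1, u'u\x00s\x00'), (2, u'ok'), (3, None)]
print(find_nul_values(rows))
```

An empty list means no text value in the sample contains a NUL, which suggests the encoding setting took effect for those columns.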

Best regards,
Anderson Neves



Source: https://stackoverflow.com/questions/32703664/python-pandas-to-csv-output-returns-single-character-for-string-object-values
