Python Pandas to_csv Output Returns Single Character for String/Object Values

Question

I'm loading a query result into a pandas DataFrame. When I print the DataFrame, the string values look correct, but when I call to_csv on it, the CSV output contains only the first character of every string/object value.

import pandas
df = pandas.DataFrame({'a':[u'u\x00s\x00']})
df.to_csv('test.csv')

I've also tried passing an explicit encoding to to_csv:

df.to_csv('test_encoded.csv', encoding='utf-8')

But I get the same result:

>>> print df
      a
0  us

(output in csv file)
u

For reference, I'm connecting to a Vertica database and using the following setup:

OS: Mac OS X Yosemite (10.10.5)
Python 2.7.10 |Anaconda 2.3.0 (x86_64)| (default, Sep 15 2015, 14:29:08)
pyodbc 3.0.10
pandas 0.16.2
ODBC: Vertica ODBC 6.1.3

Any help figuring out how to pass the entire object string using the to_csv function in pandas would be greatly appreciated.

Answer 1:

I was facing the same problem and found the post "UTF-32 in Python".

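To fix your problem, I believe you need to replace every '\x00' (NUL) character with an empty string; the NUL after the first character is presumably where the CSV output stops. I managed to write the correct CSV with the code below: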

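# Translation table: map code point 0x00 (NUL) to an empty string
fixer = dict.fromkeys([0x00], u'')
# Strip the NUL characters from every value in column 'a'
df['a'] = df['a'].map(lambda x: x.translate(fixer))
df.to_csv('test.csv')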
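The same cleanup can be sketched without pandas, which makes it easier to see what the translate step does. This is only an illustration written for Python 3 (where str is already Unicode; on Python 2.7 the check would be against unicode). The helper name strip_nul and the sample rows are made up for the example, and the standard csv module stands in for to_csv.

```python
import csv
import io


def strip_nul(value):
    """Remove NUL characters from text values; pass anything else through."""
    if isinstance(value, str):
        return value.replace("\x00", "")
    return value


# Hypothetical rows, shaped like the values in the question.
rows = [["u\x00s\x00", 1], ["c\x00a\x00", None]]

# Write the cleaned rows to an in-memory CSV so the result can be inspected.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
for row in rows:
    writer.writerow([strip_nul(cell) for cell in row])

csv_text = buffer.getvalue()
print(csv_text)
```

Because strip_nul leaves non-string values alone, it should also be usable with Series.map on a column that mixes strings with None or numbers.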

To solve my problem with Vertica, I had to change the driver manager encoding to UTF-16 in the file /Library/Vertica/ODBC/lib/vertica.ini, using the configuration below:

[Driver]
ErrorMessagesPath=/Library/Vertica/ODBC/messages/
ODBCInstLib=/usr/lib/libiodbcinst.dylib
DriverManagerEncoding=UTF-16
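As a rough illustration of why an encoding mismatch could produce strings like u'u\x00s\x00', the Python 3 sketch below encodes a short string as UTF-32 and then decodes the same bytes as UTF-16. That this is what actually happens between the driver and pyodbc is an assumption, not something confirmed here; the variable names are made up for the example.

```python
original = "us"

# Each character becomes 4 bytes in little-endian UTF-32: 75 00 00 00 73 00 00 00
raw = original.encode("utf-32-le")

# Reading those bytes as 2-byte UTF-16 units turns every second unit into NUL.
misread = raw.decode("utf-16-le")
print(repr(misread))

# Decoding with the matching codec recovers the text, no cleanup needed.
correct = raw.decode("utf-32-le")
print(repr(correct))
```

If the mismatch is of this kind, aligning the encodings at the driver level addresses the cause, while stripping '\x00' afterwards only repairs the symptom.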

Best regards,
Anderson Neves

Source: https://stackoverflow.com/questions/32703664/python-pandas-to-csv-output-returns-single-character-for-string-object-values

Tags

python-2.7

pandas

unicode

pyodbc

vertica