I have a pandas.DataFrame that I wish to export to a CSV file. However, pandas seems to write some of the values as float instead of int.
The problem is that you are assigning values by rows, while dtypes are grouped by columns, so everything gets cast to object dtype. That is not a good thing, because you lose all efficiency. One way around it is to convert the columns, which coerces each one to an int or float dtype as needed.
As we answered in another question, if you construct the frame all at once (or column by column), this step is not needed.
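To illustrate the difference, here is a minimal sketch. The frame contents are an assumption chosen to match the output shown in this answer, and the names `by_rows` and `at_once` are only for illustration.

```python
import numpy as np
import pandas as pd

# Filled row by row: an empty frame starts with object columns,
# and row-wise assignment leaves them as object.
by_rows = pd.DataFrame(index=list("xyz"), columns=list("abcd"))
by_rows.loc["x"] = [10, 10, np.nan, 10]
by_rows.loc["y"] = [1, 5, 2, 3]
by_rows.loc["z"] = [1, 2, 3, 4]
print(by_rows.dtypes)

# Built in one go from columns: each column gets its own numeric dtype.
# Column "c" holds a NaN, so it becomes float; the others are integers.
at_once = pd.DataFrame(
    {"a": [10, 1, 1], "b": [10, 5, 2], "c": [np.nan, 2, 3], "d": [10, 3, 4]},
    index=list("xyz"),
)
print(at_once.dtypes)
```

For a frame that has already ended up with object columns, the conversion looks like this: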
In [23]: def convert(x):
   ....:     try:
   ....:         return x.astype(int)
   ....:     except (TypeError, ValueError):
   ....:         return x
   ....:
In [24]: df.apply(convert)
Out[24]:
    a   b   c   d
x  10  10 NaN  10
y   1   5   2   3
z   1   2   3   4
In [25]: df.apply(convert).dtypes
Out[25]:
a      int64
b      int64
c    float64
d      int64
dtype: object
In [26]: df.apply(convert).to_csv('test.csv')
In [27]: !cat test.csv
,a,b,c,d
x,10,10,,10
y,1,5,2.0,3
z,1,2,3.0,4
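Column c is still written as 2.0 and 3.0, because the NaN forces it to float64. If integers are wanted there as well, one option that goes beyond the approach above is the nullable "Int64" dtype. This is a sketch that assumes a pandas version with nullable integers (0.24 or later) and the same example values as in the output above:

```python
import numpy as np
import pandas as pd

# Example frame (values assumed from the output above).
df = pd.DataFrame(
    {"a": [10, 1, 1], "b": [10, 5, 2], "c": [np.nan, 2.0, 3.0], "d": [10, 3, 4]},
    index=list("xyz"),
)

# Cast the float column to the nullable integer dtype; NaN becomes <NA>,
# which to_csv writes as an empty field by default.
csv_text = df.astype({"c": "Int64"}).to_csv()
print(csv_text)
```

The cast only succeeds when every non-missing value in the column is a whole number, so it is best applied to columns that are known to hold integers.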