Python Pandas Series failure datetime

有些话、适合烂在心里 submitted on 2019-12-23 08:35:19

Question


I think this must be a bug in pandas (seen in v0.18.1 and v0.19). When I assign a date to a new label in a pandas Series, the first assignment stores it as an int (wrong), while the second assignment stores it as a datetime (correct). I cannot understand why.

For instance with this code:

import datetime as dt
import pandas as pd

series = pd.Series(list('abc'))
date = dt.datetime(2016, 10, 30, 0, 0)

# First assignment: creates the new label "Date_column"
series["Date_column"] = date
print("The date is {} and the type is {}".format(series["Date_column"], type(series["Date_column"])))

# Second assignment: overwrites the existing label with the same value
series["Date_column"] = date
print("The date is {} and the type is {}".format(series["Date_column"], type(series["Date_column"])))

The output is:

The date is 1477785600000000000 and the type is <class 'int'>
The date is 2016-10-30 00:00:00 and the type is <class 'datetime.datetime'>

As you can see, the first assignment always stores the value as an int instead of a datetime.

Could someone help me? Thank you very much in advance, Javi.


Answer 1:


The reason is that series has the 'object' dtype. Each column of a pandas DataFrame (and each Series) holds a single, homogeneous dtype. You can inspect it with dtype (or DataFrame.dtypes):

series = pd.Series(list('abc'))
series
Out[3]:
0    a
1    b
2    c
dtype: object

In [15]: date = dt.datetime(2016, 10, 30, 0, 0)
date
Out[15]: datetime.datetime(2016, 10, 30, 0, 0)

In [18]: print(date)
2016-10-30 00:00:00

In [17]: type(date)
Out[17]: datetime.datetime

In [19]: series["Date_column"] = date
In [20]: series

Out[20]:
0                                a
1                                b
2                                c
Date_column    1477785600000000000
dtype: object

In [22]: series.dtype

Out[22]: dtype('O')

Only the generic 'object' dtype can hold arbitrary Python objects (in your case, a datetime.datetime object inserted into the Series).

Moreover, pandas Series are built on NumPy arrays, which are not designed for mixed types. Storing mixed types in one Series defeats the purpose of using pandas DataFrames and Series (or NumPy) for their computational benefits.
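To illustrate the difference between a homogeneous and a mixed Series, here is a minimal sketch. It assumes a reasonably recent pandas version, and the variable names dates and mixed are illustrative and not from the question. It does not reproduce the int conversion reported above; it only shows which dtype pandas picks in each case.

```python
import datetime as dt
import pandas as pd

date = dt.datetime(2016, 10, 30, 0, 0)

# A Series that holds only dates gets a dedicated datetime64 dtype.
dates = pd.Series([date, date + dt.timedelta(days=1)])
print(dates.dtype)
print(pd.api.types.is_datetime64_any_dtype(dates))

# Mixing strings and a date forces the generic object dtype.
mixed = pd.Series(['a', 'b', 'c', date])
print(mixed.dtype)
print(type(mixed.iloc[3]))
```

The exact datetime64 unit printed for dates can vary between pandas versions, which is why the check uses is_datetime64_any_dtype rather than comparing against a fixed dtype string.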

Could you use a Python list instead, or a DataFrame?
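Both alternatives could look roughly like the sketch below. The column name "letter" and the variable names are assumptions made for illustration, and the DataFrame variant repeats the date on every row, which may or may not suit your data.

```python
import datetime as dt
import pandas as pd

date = dt.datetime(2016, 10, 30, 0, 0)

# Option 1: a plain Python list keeps every element exactly as given.
values = list('abc') + [date]
print(type(values[-1]))

# Option 2: a DataFrame gives the date its own column with its own dtype,
# so the string column stays object and the date column is datetime64.
df = pd.DataFrame({"letter": list('abc')})
df["Date_column"] = date  # the scalar is broadcast to every row
print(df.dtypes)
print(df["Date_column"].iloc[0])
```

With the DataFrame, each column keeps a single dtype, so the date is never forced into the same container as the strings.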



Source: https://stackoverflow.com/questions/40716361/python-pandas-series-failure-datetime
