pandas: subtracting current date from the date in a pandas table

Submitted by 限于喜欢 on 2020-05-23 17:47:24

Question


I am trying to calculate the difference in days between today's date and a pandas column of historical dates. This is the code I intended to use:

df['diff'] = pd.to_datetime( df['date']) - pd.datetime.now().date()

However, it produces the following error:

TypeError: unsupported operand type(s) for -: 'DatetimeIndex' and 'datetime.date'

The date column in the pandas table looks like this:

0       2018-12-18
1       2018-12-18
2       2018-12-18
3       2018-12-18
4       2018-12-18

How do I fix this error? Thanks in advance.


Answer 1:


Both operands must be of the same type: subtract datetimes from datetimes (with the time set to midnight), or dates from dates.

Use Timestamp.now together with Timestamp.normalize or Timestamp.floor to remove the time component:

df['diff'] = pd.to_datetime(df['date']) - pd.Timestamp.now().normalize()

df['diff'] = pd.to_datetime(df['date']) - pd.Timestamp.now().floor('D')
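As a quick check that normalize and floor truncate to the same midnight value, here is a minimal sketch. It uses a fixed timestamp (an illustrative value, not from the question) instead of Timestamp.now() so the result is reproducible; the variable names are hypothetical.

```python
import pandas as pd

# Fixed timestamp (illustrative value) so the result does not depend on the clock.
ts = pd.Timestamp('2018-12-20 17:47:24')

midnight_a = ts.normalize()  # sets the time to 00:00:00
midnight_b = ts.floor('D')   # rounds down to the start of the day

print(midnight_a)
print(midnight_b)
print(midnight_a == midnight_b)
```

Either call can be used in the subtraction; for a timezone-naive timestamp like this one they give the same value.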

You can also use replace:

dt = pd.Timestamp.now().replace(hour=0, minute=0, second=0, microsecond=0)
df['diff'] = pd.to_datetime(df['date']) - dt

Or convert the datetimes to dates so that both operands are dates:

import datetime

dt = datetime.datetime.now().date()
df['diff'] = pd.to_datetime(df['date']).dt.date - dt

Sample:

rng = pd.date_range('2018-04-03', periods=10, freq='100D')
df = pd.DataFrame({'date': rng}) 

df['diff'] = pd.to_datetime( df['date']) - pd.Timestamp.now().normalize() 
print(df)
        date      diff
0 2018-04-03 -261 days
1 2018-07-12 -161 days
2 2018-10-20  -61 days
3 2019-01-28   39 days
4 2019-05-08  139 days
5 2019-08-16  239 days
6 2019-11-24  339 days
7 2020-03-03  439 days
8 2020-06-11  539 days
9 2020-09-19  639 days
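The diff column in the sample holds timedelta values. Since the question asks for the difference in days, one option is the .dt.days accessor, which returns plain integers. The sketch below assumes a fixed reference date of 2018-12-20 (chosen to match the sample output above) in place of Timestamp.now(); the `today` and `days` names are hypothetical.

```python
import pandas as pd

rng = pd.date_range('2018-04-03', periods=10, freq='100D')
df = pd.DataFrame({'date': rng})

# Assumption: a fixed reference date instead of pd.Timestamp.now().normalize(),
# so the numbers are reproducible.
today = pd.Timestamp('2018-12-20')

df['diff'] = df['date'] - today   # timedelta64 column
df['days'] = df['diff'].dt.days   # integer number of whole days

print(df)
```

With a live clock, replace `today` with pd.Timestamp.now().normalize(); the values then shift as the current date changes.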



Answer 2:


There is a subtle but important distinction. In arithmetic with datetime columns, Pandas supports datetime.datetime objects but does not support datetime.date objects:

from datetime import date, datetime

# TypeError: unsupported operand type(s) for -: 'DatetimeIndex' and 'datetime.date'
df['date'] - date.today()

# works correctly
df['date'] - datetime.now()

# works correctly
df['date'] - datetime.now().replace(minute=0, hour=0, second=0, microsecond=0)

Note that pd.Timestamp.date returns a datetime.date object. The docs do specify this: "Return date object with same year, month and day." That date object is not supported natively by Pandas in the way datetime objects are.
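To make the type distinction concrete, here is a small sketch with illustrative values (the variable names are hypothetical). It shows that Timestamp.date() returns a plain datetime.date, and that wrapping the date back into a pd.Timestamp gives a value that can be subtracted from a datetime column.

```python
import datetime
import pandas as pd

ts = pd.Timestamp('2018-12-18 09:30:00')
d = ts.date()

# date() drops the time and returns a plain datetime.date
print(type(d))
print(isinstance(d, datetime.datetime))   # a date is not a datetime

# pd.Timestamp itself is a subclass of datetime.datetime
print(isinstance(ts, datetime.datetime))

# Converting the date back to a Timestamp (midnight of that day)
# yields a type that pandas can subtract from a datetime Series.
s = pd.Series(pd.to_datetime(['2018-12-18', '2018-12-19']))
diff = s - pd.Timestamp(d)
print(diff)
```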

But replacing time values is cumbersome. You will likely prefer using built-in Pandas methods for your calculations. These are all equivalent:

df['date'] - pd.Timestamp('today').floor('D')
df['date'] - pd.Timestamp.today().normalize()
df['date'] - pd.to_datetime('today').normalize()


Source: https://stackoverflow.com/questions/53867536/pandas-subtracting-current-date-from-the-date-in-a-pandas-table
