Pandas - Number of Months Between Two Dates

夙愿已清, posted on 2019-11-28 18:21:17

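Here is a very simple answer, my friend (assuming numpy is imported as np and both columns hold datetimes):

# Fractional months; newer pandas releases may reject the ambiguous 'M' unit
df['nb_months'] = ((df.date2 - df.date1)/np.timedelta64(1, 'M'))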

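and now, to truncate the fractional result to whole months:

df['nb_months'] = df['nb_months'].astype(int)

Alternatively, count calendar months directly from the year and month components (here the columns are named Date1 and Date2):

df.assign(
    Months=
    (df.Date2.dt.year - df.Date1.dt.year) * 12 +
    (df.Date2.dt.month - df.Date1.dt.month)
)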

       Date1      Date2  Months
0 2016-04-07 2017-02-01      10
1 2017-02-01 2017-03-05       1

An alternative, possibly more elegant solution is df.Date2.dt.to_period('M') - df.Date1.dt.to_period('M'), which avoids rounding errors.
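As a rough sketch of the period-based approach, the snippet below rebuilds the two-row table shown above (the sample DataFrame is an assumption made for illustration) and subtracts monthly periods. It assumes a reasonably recent pandas, where the subtraction yields month offset objects rather than plain integers, so the count is read from each offset's `.n` attribute.

```python
import pandas as pd

# Hypothetical sample data matching the table shown above.
df = pd.DataFrame({
    "Date1": pd.to_datetime(["2016-04-07", "2017-02-01"]),
    "Date2": pd.to_datetime(["2017-02-01", "2017-03-05"]),
})

# Subtracting monthly periods counts the calendar-month boundaries crossed.
diff = df.Date2.dt.to_period("M") - df.Date1.dt.to_period("M")

# Assumption: recent pandas returns month offsets here, so extract
# the integer number of months from each offset's .n attribute.
df["Months"] = diff.apply(lambda offset: offset.n)
print(df)
```

On older pandas versions the subtraction may already return integers, in which case the `.apply` step would not be needed.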

There are two notions of difference in time, which are both correct in a certain sense. Let us compare the difference in months between July 31 and September 01:

import numpy as np
import pandas as pd

dtr = pd.date_range(start="2016-07-31", end="2016-09-01", freq="D")
delta1 = int((dtr[-1] - dtr[0])/np.timedelta64(1,'M'))
delta2 = (dtr[-1].to_period('M') - dtr[0].to_period('M')).n
print(delta1,delta2)

Using NumPy's timedelta, delta1 = 1, which is correct in the sense that only about one month of time elapses (32 days), whereas delta2 = 2, which is also correct in the sense that September is two calendar months after July. In most cases both give the same answer, but depending on the context one may be more appropriate than the other.
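The two notions can also be written without relying on the 'M' timedelta unit, which some newer pandas releases may refuse as ambiguous. The following is only a sketch: the helper names `elapsed_months` and `calendar_months` are hypothetical, and the average month length of 365.2425 / 12 days is an assumption chosen to approximate what the NumPy division above computes.

```python
import pandas as pd

# Assumption: average Gregorian month length, about 30.44 days.
AVG_DAYS_PER_MONTH = 365.2425 / 12

def elapsed_months(start, end):
    """Whole months of elapsed time, based on an average month length."""
    days = (end - start) / pd.Timedelta(days=1)
    return int(days / AVG_DAYS_PER_MONTH)

def calendar_months(start, end):
    """Number of calendar-month boundaries between the two dates."""
    return (end.year - start.year) * 12 + (end.month - start.month)

start = pd.Timestamp("2016-07-31")
end = pd.Timestamp("2016-09-01")
print(elapsed_months(start, end), calendar_months(start, end))  # 1 2
```

Which helper fits depends on whether the question is "how much time has passed" (elapsed) or "how many month boundaries were crossed" (calendar).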
