Pandas datetime anchored offset for (-) MonthBegin doesn't work as expected

試著忘記壹切 submitted on 2021-02-07 07:16:49

Question


I need to move back to the beginning of the month, but if I'm already at the beginning I want to stay there. A pandas anchored offset with n=0 is supposed to do exactly that, but for (-) MonthBegin it doesn't produce the expected results between the anchor points.

For example, I expect pd.Timestamp('2017-01-06 00:00:00') - pd.tseries.offsets.MonthBegin(n=0) to move me back to Timestamp('2017-01-01 00:00:00'), but instead I get Timestamp('2017-02-01 00:00:00'). What am I doing wrong? Or do you think it's a bug?

I can also see that the same rule works fine for MonthEnd, so by combining the two, as in pd.Timestamp('2017-01-06 00:00:00')+pd.tseries.offsets.MonthEnd(n=0)-pd.tseries.offsets.MonthBegin(n=1), I get the desired result of Timestamp('2017-01-01 00:00:00'). Still, my expectation was for it to work with just - pd.tseries.offsets.MonthBegin(n=0).


Answer 1:


This is indeed the correct behavior. It follows from the rules laid out in Anchored Offset Semantics for offsets that are anchored to the start or end of a particular frequency.

Consider the given example:

import pandas as pd
from pandas.tseries.offsets import MonthBegin

pd.Timestamp('2017-01-02 00:00:00') - MonthBegin(n=0)
Out[18]:
Timestamp('2017-02-01 00:00:00')

Note that the anchor point for the MonthBegin offset is the first of every month. The given timestamp is past that day, so with n=0 it is rolled forward to the next anchor point, the first of the following month. The sign in front of the offset makes no difference here, because negating an offset with n=0 still leaves n=0; backward rolling only comes into play when n is non-zero.

excerpt from docs
For the case when n=0, the date is not moved if on an anchor point, otherwise it is rolled forward to the next anchor point.
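
The rule quoted above can be checked directly. The following is a minimal sketch, assuming a reasonably recent pandas; the variable names are illustrative only and not part of the original answer.

```python
import pandas as pd
from pandas.tseries.offsets import MonthBegin, MonthEnd

mid_month = pd.Timestamp("2017-01-06")
on_anchor = pd.Timestamp("2017-01-01")

# n=0 carries no direction: negating the offset still gives n=0,
# so adding and subtracting it behave the same way.
added = mid_month + MonthBegin(n=0)
subtracted = mid_month - MonthBegin(n=0)

# On an anchor point, n=0 leaves the timestamp where it is.
unchanged = on_anchor - MonthBegin(n=0)

# MonthEnd(n=0) also rolls forward, here to the end of the same month.
month_end = mid_month + MonthEnd(n=0)

print(added, subtracted, unchanged, month_end)
```

Because the forward roll for MonthEnd lands inside the same month, n=0 looks "right" there, while the same forward roll for MonthBegin lands in the following month.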


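To get what you're after, provide n=1, which rolls a timestamp that is not on an anchor point back to the first of its month. (A timestamp already on the first would instead move back to the first of the previous month.)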

pd.Timestamp('2017-01-02 00:00:00') - MonthBegin(n=1)
Out[20]:
Timestamp('2017-01-01 00:00:00')

If the timestamp is already exactly on the anchor point, n=0 leaves it unchanged, which is the desired result and matches the docs excerpt above.

pd.Timestamp('2017-01-01 00:00:00') - MonthBegin(n=0)
Out[21]:
Timestamp('2017-01-01 00:00:00')
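
Putting the two cases together, one possible way to get "go back to the first of the month, but stay if already there" is to check the anchor explicitly. This is only a sketch: `month_start` is a hypothetical helper name, not a pandas function, and it assumes `is_on_offset` is available (pandas 1.0 or later).

```python
import pandas as pd
from pandas.tseries.offsets import MonthBegin


def month_start(ts: pd.Timestamp) -> pd.Timestamp:
    """Hypothetical helper: first of ts's month, unchanged if already there."""
    if MonthBegin().is_on_offset(ts):
        # Already on the anchor point: do not move.
        return ts
    # Off the anchor: n=1 rolls back to the first of the same month.
    return ts - MonthBegin(n=1)


print(month_start(pd.Timestamp("2017-01-06")))  # off the anchor
print(month_start(pd.Timestamp("2017-01-01")))  # on the anchor

# Without the check, n=1 on an anchor point moves a full month back.
print(pd.Timestamp("2017-01-01") - MonthBegin(n=1))
```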



Answer 2:


To jump to the month's start, use:

ts + pd.tseries.offsets.MonthEnd(n=0) - pd.tseries.offsets.MonthBegin(n=1)

Yes, it's ugly, but it jumps to the first of the month while staying there if ts is already the first.

Quick demo:

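>>> import datetime as dt
>>> import pandas as pd
>>> from pandas.tseries.offsets import MonthBegin, MonthEnd
>>> pd.date_range(dt.datetime(2016,12,30), dt.datetime(2017,2,2)).to_series() \
...         + MonthEnd(n=0) - MonthBegin(n=1)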

2016-12-30   2016-12-01
2016-12-31   2016-12-01
2017-01-01   2017-01-01
2017-01-02   2017-01-01
...
2017-01-31   2017-01-01
2017-02-01   2017-02-01
2017-02-02   2017-02-01
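
As a possible alternative to the two-step expression, the sketch below compares it with `MonthBegin().rollback(ts)` and with a round trip through a monthly Period. It assumes a reasonably recent pandas and midnight timestamps like those in the demo; the list names are illustrative and not part of the original answer.

```python
import pandas as pd
from pandas.tseries.offsets import MonthBegin, MonthEnd

days = pd.date_range("2016-12-30", "2017-02-02")

# The two-step expression from the answer, applied per timestamp.
two_step = [ts + MonthEnd(n=0) - MonthBegin(n=1) for ts in days]

# rollback moves to the previous anchor only if ts is not already on one.
rolled = [MonthBegin().rollback(ts) for ts in days]

# Converting to a monthly Period and back gives the start of that month.
via_period = [ts.to_period("M").to_timestamp() for ts in days]

print(two_step == rolled == via_period)
```

Note that the Period round trip returns midnight on the first of the month, so it is only interchangeable with the other two when the input timestamps carry no time-of-day component.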


Source: https://stackoverflow.com/questions/42008805/pandas-datetime-anchored-offset-for-monthbegin-doesnt-work-as-expected
