Question:
I'm using pandas, and I'm wondering what the easiest way is to get the business days between a start date and an end date.
There are a lot of posts out there about doing this in plain Python (for example), but I would like to do it directly in pandas, as I think pandas can probably handle this quite easily.
Answer 1:
Use BDay() to get the business days in a range.
    from pandas.tseries.offsets import *

    In [185]: s
    Out[185]:
    2011-01-01   -0.011629
    2011-01-02   -0.089666
    2011-01-03   -1.314430
    2011-01-04   -1.867307
    2011-01-05    0.779609
    2011-01-06    0.588950
    2011-01-07   -2.505803
    2011-01-08    0.800262
    2011-01-09    0.376406
    2011-01-10   -0.469988
    Freq: D

    In [186]: s.asfreq(BDay())
    Out[186]:
    2011-01-03   -1.314430
    2011-01-04   -1.867307
    2011-01-05    0.779609
    2011-01-06    0.588950
    2011-01-07   -2.505803
    2011-01-10   -0.469988
    Freq: B
With slicing:
    from datetime import datetime

    In [187]: x = datetime(2011, 1, 5)

    In [188]: y = datetime(2011, 1, 9)

    In [189]: s.loc[x:y]
    Out[189]:
    2011-01-05    0.779609
    2011-01-06    0.588950
    2011-01-07   -2.505803
    2011-01-08    0.800262
    2011-01-09    0.376406
    Freq: D

    In [190]: s.loc[x:y].asfreq(BDay())
    Out[190]:
    2011-01-05    0.779609
    2011-01-06    0.588950
    2011-01-07   -2.505803
    Freq: B
and count() to get the number of business days in the slice:

    In [191]: s.loc[x:y].asfreq(BDay()).count()
    Out[191]: 3
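For anyone who wants to paste and run the steps above in one go, here is a minimal sketch. The series values are made up (the original `s` holds random numbers), and the variable names `window`, `business` and `n_business_days` are only illustrative.

```python
import numpy as np
import pandas as pd
from pandas.tseries.offsets import BDay

# Hypothetical daily series covering 2011-01-01 through 2011-01-10;
# the values are placeholders, only the index matters here.
idx = pd.date_range("2011-01-01", "2011-01-10", freq="D")
s = pd.Series(np.arange(len(idx), dtype=float), index=idx)

# Label-based slicing on a DatetimeIndex includes both endpoints.
window = s.loc["2011-01-05":"2011-01-09"]

# Re-index the slice onto a Monday-to-Friday frequency.
business = window.asfreq(BDay())

# count() counts non-NaN values, so this equals the number of
# business days only when the series has no missing data.
n_business_days = int(business.count())
print(n_business_days)
```

Because count() skips NaN, `len(business)` is probably the safer choice if the series might contain gaps.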
Answer 2:
You can also use date_range for this purpose.
    In [3]: pd.date_range('2011-01-05', '2011-01-09', freq=BDay())
    Out[3]: DatetimeIndex(['2011-01-05', '2011-01-06', '2011-01-07'], dtype='datetime64[ns]', freq='B', tz=None)
EDIT: Or, even simpler:
    In [7]: pd.bdate_range('2011-01-05', '2011-01-09')
    Out[7]: DatetimeIndex(['2011-01-05', '2011-01-06', '2011-01-07'], dtype='datetime64[ns]', freq='B', tz=None)
Note that both start and end dates are inclusive. Source: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.bdate_range.html
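If all you need is the count and you don't have a Series to slice, something like the sketch below should work. `count_business_days` is a hypothetical helper name, not a pandas function; it simply relies on the inclusive endpoints noted above.

```python
import pandas as pd

def count_business_days(start, end):
    """Hypothetical helper: number of Monday-to-Friday days
    from start to end, with both endpoints included."""
    return len(pd.bdate_range(start, end))

# Wednesday through Sunday: Wed, Thu and Fri are counted.
n = count_business_days("2011-01-05", "2011-01-09")

# Start and end on the same Friday: that single day is counted.
single = count_business_days("2011-01-07", "2011-01-07")

# A range covering only Saturday and Sunday yields an empty index.
weekend_only = count_business_days("2011-01-08", "2011-01-09")

print(n, single, weekend_only)
```

Keep in mind that this counts weekdays only; public holidays are not excluded.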
Answer 3:
As of v0.14 you can use holiday calendars.
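    import pandas as pd
    from pandas.tseries.holiday import USFederalHolidayCalendar
    from pandas.tseries.offsets import CustomBusinessDay

    us_bd = CustomBusinessDay(calendar=USFederalHolidayCalendar())
    # Newer pandas versions no longer accept start/end in the
    # DatetimeIndex constructor, so build the index with date_range.
    print(pd.date_range(start='2010-01-01', end='2010-01-15', freq=us_bd))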
returns:
    DatetimeIndex(['2010-01-04', '2010-01-05', '2010-01-06', '2010-01-07',
                   '2010-01-08', '2010-01-11', '2010-01-12', '2010-01-13',
                   '2010-01-14', '2010-01-15'],
                  dtype='datetime64[ns]', freq='C')
Answer 4:
Just be careful when using bdate_range or BDay(): the name might mislead you into thinking that you get a range of true business days, whereas in reality you just get calendar days with the weekends stripped out (i.e. holidays are not taken into account).
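To make the difference concrete, here is a small sketch comparing the two approaches over a range that contains a US federal holiday (New Year's Day 2010, which fell on a Friday). The choice of USFederalHolidayCalendar is only an example; pick whatever calendar matches your market or country.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

start, end = "2010-01-01", "2010-01-15"

# Weekends stripped out, holidays still included.
weekdays_only = pd.bdate_range(start, end)

# Weekends and US federal holidays stripped out.
us_bd = CustomBusinessDay(calendar=USFederalHolidayCalendar())
with_holidays = pd.date_range(start, end, freq=us_bd)

# Dates that bdate_range keeps but the holiday-aware range drops.
skipped = weekdays_only.difference(with_holidays)

print(len(weekdays_only), len(with_holidays))
print(skipped)
```

Here the only date in `skipped` is 2010-01-01, so the two counts differ by one.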