Filtering a pandas DataFrame by day

Posted by 我们两清 on 2019-12-01 17:47:27

Avoid Python datetime

First, avoid combining Python datetime objects with Pandas operations. Pandas and NumPy provide their own ways to create datetime objects for comparison, e.g. pd.Timestamp and pd.to_datetime. Your performance issues here are partly due to this behaviour, described in the docs:

pd.Series.dt.date returns an array of python datetime.date objects

Using object dtype in this way removes vectorisation benefits, as operations then require Python-level loops.
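To make the difference concrete, here is a minimal sketch comparing the two ways of building a per-day mask. The frame, its "value" column, and the chosen day are made-up sample data, not taken from the question; the point is only that normalize() keeps the datetime64 dtype while .date falls back to Python objects.

```python
import pandas as pd

# Hypothetical sample data: one row every 6 hours across three days.
idx = pd.date_range("2017-01-01", periods=12, freq="6h")
df = pd.DataFrame({"value": range(12)}, index=idx)

my_day = pd.Timestamp("2017-01-02")

# Slower pattern: .date yields an object array of datetime.date values,
# so the comparison is evaluated element by element in Python.
mask_object = df.index.date == my_day.date()

# Vectorised alternative: normalize() sets the time to midnight but keeps
# the datetime64 dtype, so the comparison stays inside Pandas / NumPy.
mask_fast = df.index.normalize() == my_day

df_day = df[mask_fast]
```

Both masks select the same rows; the second should scale much better on a large index, though the actual speed-up depends on your data.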

Use groupby operations for aggregating by date

Pandas already has functionality to group by date via normalizing time:

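# floor('D') truncates each timestamp to midnight, giving one group per calendar day
for day, df_day in df.groupby(df.index.floor('D')):
    # keep only the rows between 08:30 and 09:30 (both ends inclusive by default)
    df_day_t = df_day.between_time('08:30', '09:30')
    # do something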

As another example, you can access a slice for a particular day in this way:

g = df.groupby(df.index.floor('D'))
my_day = pd.Timestamp('2017-01-01')
# raises KeyError if no rows fall on my_day
df_slice = g.get_group(my_day)
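For a version you can run end to end, the sketch below puts both groupby snippets together on a small made-up frame. The half-hourly index, the "value" column, and the mean aggregation are assumptions chosen for illustration; substitute your own data and per-day logic.

```python
import pandas as pd

# Hypothetical sample data: half-hourly readings from 08:00 to 10:00 on two days.
day1 = pd.date_range("2017-01-01 08:00", "2017-01-01 10:00", freq="30min")
day2 = pd.date_range("2017-01-02 08:00", "2017-01-02 10:00", freq="30min")
idx = day1.append(day2)
df = pd.DataFrame({"value": range(len(idx))}, index=idx)

# One group per calendar day, keyed by the midnight Timestamp of that day.
g = df.groupby(df.index.floor("D"))

# Per-day mean over the 08:30-09:30 window (both endpoints inclusive by default).
window_means = {
    day: df_day.between_time("08:30", "09:30")["value"].mean()
    for day, df_day in g
}

# get_group raises KeyError for a day with no rows, so check g.groups first
# and fall back to an empty frame with the same columns.
my_day = pd.Timestamp("2017-01-01")
df_slice = g.get_group(my_day) if my_day in g.groups else df.iloc[0:0]
```

The membership check on g.groups is optional; it is only there so that a missing day yields an empty frame instead of an exception.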