Group data by season according to the exact dates

六月ゝ 毕业季﹏ submitted on 2019-11-29 11:35:58

You need DatetimeIndex.dayofyear:

data['SEASON'] = data.index.dayofyear.map(season)
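
The asker's season function is not shown on this page, so the following is only a sketch of what a day-of-year version might look like. The name season_from_doy, the cut-off days (borrowed from the numpy.searchsorted answer further down), and the sample data are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the asker's season(); the cut-off days are assumed.
def season_from_doy(doy):
    if doy <= 80 or doy > 355:
        return 'Winter'
    if doy <= 172:
        return 'Spring'
    if doy <= 264:
        return 'Summer'
    return 'Fall'

# Made-up sample frame with a DatetimeIndex and an 'impact' column.
idx = pd.to_datetime(['2019-01-15', '2019-04-10', '2019-07-04',
                      '2019-10-31', '2019-12-25'])
data = pd.DataFrame({'impact': [1.0, 2.0, 3.0, 4.0, 5.0]}, index=idx)

# Map each row's day of year (1 to 366) to a season label.
data['SEASON'] = data.index.dayofyear.map(season_from_doy)
print(data)
```

With this sample, January 15 and December 25 should both come out as 'Winter', which is the wrap-around case the question is about.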

Another solution with pandas.cut:

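import pandas as pd

# Shift the day of year by 11 and wrap days after 355 back to the start,
# so that late December and January share a single 'Winter' bin.
bins = [0, 91, 183, 275, 366]
labels = ['Winter', 'Spring', 'Summer', 'Fall']
doy = data.index.dayofyear
data['SEASON1'] = pd.cut(doy + 11 - 366*(doy > 355), bins=bins, labels=labels)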

pandas.cut
In order to properly handle 'Winter' being both at the beginning and end of the year, I shifted the dayofyear by 11 and wrapped the result modulo 366. The wrap has to keep the values in the range 1 to 366 rather than 0 to 365: pd.cut's first bin, (0, 91], excludes 0, so a plain (dayofyear + 11) % 366 would turn day 355 into NaN. The reason I don't use the same technique as in the numpy solution below is that pd.cut returns a categorical type, and I would end up with 5 categories in which two categories had the same label. I could then cast the result to string, but that felt sloppy.

data['SEASON'] = pd.cut(
    (data.index.dayofyear + 10) % 366 + 1,  # shift by 11, wrapping into 1..366
    [0, 91, 183, 275, 366],
    labels=['Winter', 'Spring', 'Summer', 'Fall']
)
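
When shifting and wrapping like this, it helps to check how pd.cut treats the bin edges. The snippet below is a small sketch using made-up input values (0, 1 and 366) rather than real dates. With the default right-closed bins, a value equal to the lowest edge is not assigned to any bin.

```python
import pandas as pd

bins = [0, 91, 183, 275, 366]
labels = ['Winter', 'Spring', 'Summer', 'Fall']

# Sample shifted values: the lowest edge, just above it, and the highest edge.
result = pd.cut([0, 1, 366], bins=bins, labels=labels)
print(result)

# The first value is outside every bin, so it is missing in the result.
print(pd.isna(result[0]))
```

Passing include_lowest=True to pd.cut is another way to bring 0 into the first bin, but that would label day 355 'Winter' instead of 'Fall', so the two approaches are not interchangeable.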

numpy.searchsorted
In order to properly handle 'Winter' being both at the beginning and end of the year, I allowed two bins for 'Winter', one at each end of the array:

import numpy as np

seasons = np.array(['Winter', 'Spring', 'Summer', 'Fall', 'Winter'])
# For each day of year, searchsorted gives the index (0 to 4) of the bin it falls into.
f = np.searchsorted([80, 172, 264, 355], data.index.dayofyear)
data['SEASON'] = seasons[f]
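
As a sanity check, the two-bin numpy.searchsorted approach can be compared against the shifted pandas.cut expression from the first answer over every day of a leap year. This is only a sketch. The date range is made up, and the variable names by_searchsorted, by_cut and agree are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Every day of a leap year, so that day 366 is covered too.
idx = pd.date_range('2020-01-01', '2020-12-31', freq='D')
doy = idx.dayofyear

# numpy.searchsorted version: 'Winter' appears as the first and last bin.
seasons = np.array(['Winter', 'Spring', 'Summer', 'Fall', 'Winter'])
by_searchsorted = seasons[np.searchsorted([80, 172, 264, 355], doy)]

# pandas.cut version: shift by 11 and wrap so winter is one contiguous bin.
by_cut = pd.cut(doy + 11 - 366 * (doy > 355),
                bins=[0, 91, 183, 275, 366],
                labels=['Winter', 'Spring', 'Summer', 'Fall'])

# Compare label by label across all 366 days.
agree = list(by_searchsorted) == list(by_cut)
print(agree)
```

If the two disagree on any day, the bin edges or the shift are off by one somewhere, which is an easy mistake to make with this kind of wrap-around.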

plot

data.groupby('SEASON')['impact'].mean().plot.bar()

It looks like the line

data['SEASON'] = data.index.to_series().dt.month.map(lambda x: season(x))

passes the month to season(). In pandas that is a number from 1 to 12, and if season() expects a day of year, all of those values land in "Winter". You need to use the day of year instead.

But you could probably have spotted this more easily, and printed the intermediate values to check it yourself, if you hadn't locked the extraction of the day away inside a one-liner. Just saying.
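
A rough sketch of what splitting the one-liner up might look like is below. The season function here is a hypothetical stand-in (the asker's version is not shown), its cut-off days are assumed, and the two-row frame is made up.

```python
import pandas as pd

# Hypothetical stand-in for the asker's season(); the cut-off days are assumed.
def season(doy):
    if doy <= 80 or doy > 355:
        return 'Winter'
    if doy <= 172:
        return 'Spring'
    if doy <= 264:
        return 'Summer'
    return 'Fall'

idx = pd.to_datetime(['2019-02-01', '2019-08-01'])
data = pd.DataFrame({'impact': [1.0, 2.0]}, index=idx)

# Step 1: pull out the values separately so they can be inspected.
months = data.index.month      # what the original one-liner passed to season()
days = data.index.dayofyear    # what a day-of-year season() expects
print(list(months))
print(list(days))

# Step 2: map each candidate input and compare the results.
from_months = months.map(season)
from_days = days.map(season)
print(list(from_months))
print(list(from_days))

data['SEASON'] = from_days
```

With this sample, the month-based mapping labels both rows 'Winter', while the day-of-year mapping labels August 1 as 'Summer'. Printing the intermediate values makes the mismatch visible straight away.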
