Question
I have a time series of daily data from 1992 to 2018. I have already converted it to monthly data, but I also need monthly anomalies. For that I need the average of each calendar month over all years, ending up with 12 averages: one per month, each computed from that month's value in every individual year.
I have done the following using Pandas:
import pandas as pd
df = pd.read_excel(filename, "Daily", index_col=0)  # dates become the index
df = df.resample("M").mean()  # daily values -> monthly means
I have been trying to find out how to obtain the average of each calendar month over the whole time series, but I have not found a way.
EDIT:
My data looks like this after resampling from daily to monthly:
1 2 3 ... 37 38 39
1992-01-31 0.306511 0.310543 0.211181 ... 0.352130 0.348108 0.304041
1992-02-29 0.306236 0.312186 0.211741 ... 0.353696 0.343948 0.311114
1992-03-31 0.259254 0.297998 0.195577 ... 0.329181 0.294966 0.278523
1992-04-30 0.229502 0.297078 0.186802 ... 0.298462 0.267629 0.249950
1992-05-31 0.188347 0.240783 0.159703 ... 0.251465 0.215796 0.205284
1992-06-30 0.150345 0.213644 0.129967 ... 0.220702 0.179280 0.178172
1992-07-31 0.144945 0.213217 0.118467 ... 0.224497 0.163502 0.171851
1992-08-31 0.145402 0.188320 0.115089 ... 0.209280 0.159910 0.158608
1992-09-30 0.151685 0.194237 0.123106 ... 0.216324 0.174529 0.154490
1992-10-31 0.169207 0.235069 0.129761 ... 0.240324 0.197842 0.172253
1992-11-30 0.223199 0.271601 0.175349 ... 0.280514 0.258155 0.223209
1992-12-31 0.241892 0.302605 0.192563 ... 0.328505 0.289020 0.256858
1993-01-31 0.263852 0.351839 0.207057 ... 0.362024 0.340665 0.278063
1993-02-28 0.309779 0.392905 0.244505 ... 0.374407 0.386738 0.330977
1993-03-31 0.301839 0.364442 0.230318 ... 0.377743 0.344132 0.336906
1993-04-30 0.271325 0.317197 0.209343 ... 0.345088 0.306911 0.289592
(Date is the index, not column 1. Column 1 starts with 0.306511, and so on.) The data continues all the way to the end of 2018. So I need to obtain the average of all the Januaries, all the Februaries, etc., for each one of the columns.
Answer 1:
What you need is groupby:
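# Date is the index here, so take the month from the DatetimeIndex
# (if Date were a regular column, use df['Date'].dt.month instead).
m = df.index.month
result = df.groupby(m).mean()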
# Rename month 1 to January, 2 to February, etc.
result.index = pd.date_range('1/1/2019', '12/1/2019', freq='MS').strftime('%B')
Result (with random input):
Value
January 51.838811
February 51.455804
March 51.275257
April 52.027894
May 49.101480
June 51.866638
July 51.600765
August 50.416463
September 48.732991
October 51.477874
November 50.797786
December 51.003006
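Since the question also asks for anomalies, here is a minimal sketch of how the same groupby could be extended to get them. It assumes the dates are the DatetimeIndex, as in the question's edit, and it uses synthetic random data with three made-up columns in place of the real spreadsheet. The names daily, monthly, climatology and anomalies are only illustrative. It resamples with "MS" (month start) rather than "M" because the month-end alias differs between pandas versions; the monthly means are the same, only the date labels change.

```python
import numpy as np
import pandas as pd

# Synthetic daily data standing in for the real Excel sheet (assumption).
rng = np.random.default_rng(0)
idx = pd.date_range("1992-01-01", "2018-12-31", freq="D")
daily = pd.DataFrame(rng.random((len(idx), 3)), index=idx, columns=[1, 2, 3])

# Daily -> monthly means, labelled by the first day of each month.
monthly = daily.resample("MS").mean()

# Calendar month (1..12) of every row; used as the grouping key.
months = monthly.index.month

# Average of all Januaries, all Februaries, etc.: 12 rows, one per month.
climatology = monthly.groupby(months).mean()

# Anomaly = monthly value minus the long-term mean of its calendar month.
# transform("mean") returns a frame aligned with monthly, so they subtract directly.
anomalies = monthly - monthly.groupby(months).transform("mean")
```

By construction, the anomalies of each calendar month average to zero over the whole period, which is a quick sanity check on real data.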
Source: https://stackoverflow.com/questions/57914948/get-average-by-months-of-a-time-series-all-januaries-all-februaries-etc