Question
This is probably pretty easy, but for some reason I am finding it quite difficult to do. Any tips would be greatly appreciated. I have some time series data at 5-minute intervals each day, like this:
Date Values
2012-12-05 09:30:00 5
2012-12-05 09:35:00 7
2012-12-05 09:40:00 3
2012-12-05 09:45:00 2
2012-12-05 09:50:00 15
2012-12-06 09:30:00 4
2012-12-06 09:35:00 3
2012-12-06 09:40:00 8
2012-12-06 09:45:00 1
I would like to calculate the differences relative to the first value of each day (which in this case will always be the 9:30 value), i.e. end up with this DataFrame:
Date Values
2012-12-05 09:30:00 0
2012-12-05 09:35:00 2
2012-12-05 09:40:00 -2
2012-12-05 09:45:00 -3
2012-12-05 09:50:00 10
2012-12-06 09:30:00 0
2012-12-06 09:35:00 -1
2012-12-06 09:40:00 4
2012-12-06 09:45:00 -3
Answer 1:
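You need to subtract a Series created by groupby on Series.dt.date with transform and first. Grouping on dt.date rather than dt.day keeps the same day of the month in different months apart:
print (df.Values.groupby(df.Date.dt.date).transform('first'))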
0 5
1 5
2 5
3 5
4 5
5 4
6 4
7 4
8 4
Name: Values, dtype: int64
df.Values = df.Values - df.Values.groupby(df.Date.dt.date).transform('first')
print (df)
Date Values
0 2012-12-05 09:30:00 0
1 2012-12-05 09:35:00 2
2 2012-12-05 09:40:00 -2
3 2012-12-05 09:45:00 -3
4 2012-12-05 09:50:00 10
5 2012-12-06 09:30:00 0
6 2012-12-06 09:35:00 -1
7 2012-12-06 09:40:00 4
8 2012-12-06 09:45:00 -3
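As a self-contained sketch of this approach, the following rebuilds the sample data from the question and subtracts each day's first value. The `Diff` column name, the `first_of_day` variable, and the explicit sort are assumptions added for illustration; they are not part of the original answer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2012-12-05 09:30:00", "2012-12-05 09:35:00", "2012-12-05 09:40:00",
        "2012-12-05 09:45:00", "2012-12-05 09:50:00", "2012-12-06 09:30:00",
        "2012-12-06 09:35:00", "2012-12-06 09:40:00", "2012-12-06 09:45:00",
    ]),
    "Values": [5, 7, 3, 2, 15, 4, 3, 8, 1],
})

# Sort by timestamp so that "first" really is the earliest row of each day.
df = df.sort_values("Date")

# First value of each calendar day, repeated for every row of that day.
first_of_day = df["Values"].groupby(df["Date"].dt.date).transform("first")

# Difference relative to the day's first value (hypothetical column name).
df["Diff"] = df["Values"] - first_of_day
print(df)
```

Writing to a separate column keeps the original values available for checking; assigning back to `Values`, as the answer does, works the same way.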
Answer 2:
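You can use broadcasting. Note that this subtracts the very first value from every row, so it only gives the desired result when the data covers a single day: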
df.Values - df.Values.iloc[0]
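A minimal sketch of what the broadcasting expression does on the two-day sample, next to one possible way of applying the same subtraction within each day. The names `values`, `whole`, and `per_day` are hypothetical, and the use of a DatetimeIndex here is an assumption made to keep the example short.

```python
import pandas as pd

# Sample data from the question, indexed by timestamp.
values = pd.Series(
    [5, 7, 3, 2, 15, 4, 3, 8, 1],
    index=pd.to_datetime([
        "2012-12-05 09:30:00", "2012-12-05 09:35:00", "2012-12-05 09:40:00",
        "2012-12-05 09:45:00", "2012-12-05 09:50:00", "2012-12-06 09:30:00",
        "2012-12-06 09:35:00", "2012-12-06 09:40:00", "2012-12-06 09:45:00",
    ]),
    name="Values",
)

# The scalar values.iloc[0] is broadcast across all rows, so the second day
# is also measured against the first day's 09:30 value.
whole = values - values.iloc[0]

# Applying the same subtraction inside each calendar-day group instead.
per_day = values.groupby(values.index.date).transform(lambda s: s - s.iloc[0])

print(whole)
print(per_day)
```

On this sample, `whole` and `per_day` agree for 2012-12-05 but differ for 2012-12-06, which is where the per-day grouping matters.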
Source: https://stackoverflow.com/questions/40104449/pandas-calculating-daily-differences-relative-to-earliest-value