Question
I have a MultiIndex DataFrame whose index levels are Date and Symbol. For each Symbol I want to compute rolling_max of number, but I want to be able to reset the row from which that maximum starts being evaluated. The reset is driven by a column, condition, containing True or False. If condition is True on a Date, rolling_max should reset and be computed from that Date onward. If condition is False, rolling_max should work 'normally', taking the max of the numbers seen for the given Symbol since the last reset. The condition column has nothing to do with the number column (they do not depend on each other). This is the expected output:
                   number  condition  rolling_max
Date       Symbol
1990-01-01 A           29      False           29
1990-01-01 B            7      False            7
1990-01-02 A           13       True           13  # Reset rolling max for 'A'
1990-01-02 B            2      False            7
1990-01-03 A           11      False           13
1990-01-03 B           52       True           52  # Reset rolling max for 'B'
1990-01-04 A           30      False           30
1990-01-04 B            1      False           52
1990-01-05 A           19       True           19  # Reset rolling max for 'A'
1990-01-05 B           65      False           65
1990-01-06 A           17      False           19
1990-01-06 B           20       True           20  # Reset rolling max for 'B'
How can I do this?
Answer 1:
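I was able to solve this. The cumulative sum of condition within each Symbol goes up by one at every True, so it labels the block of rows between two resets. Grouping by Symbol together with that label and taking cummax of number restarts the maximum at each reset: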
df['rolling_max'] = df.groupby(['Symbol',df.groupby('Symbol')['condition'].cumsum()])['number'].cummax()
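For anyone who wants to try this, here is a minimal sketch of the same idea split into two steps. It rebuilds the frame from the expected output in the question. The name segment is only illustrative and not part of the original answer, and the astype(int) cast is an assumption added so the cumulative sum does not depend on how a given pandas version handles booleans.

```python
import pandas as pd

# Rebuild the example frame (values copied from the question's expected output).
dates = pd.to_datetime(
    ["1990-01-01", "1990-01-02", "1990-01-03",
     "1990-01-04", "1990-01-05", "1990-01-06"]
)
index = pd.MultiIndex.from_product([dates, ["A", "B"]], names=["Date", "Symbol"])
df = pd.DataFrame(
    {
        "number": [29, 7, 13, 2, 11, 52, 30, 1, 19, 65, 17, 20],
        "condition": [False, False, True, False, False, True,
                      False, False, True, False, False, True],
    },
    index=index,
)

# Step 1: per Symbol, count how many resets (True values) have occurred so far.
# Rows between two resets share the same count, so it acts as a segment label.
# "segment" is a hypothetical helper name used only for this illustration.
segment = (
    df["condition"].astype(int).groupby(level="Symbol").cumsum().rename("segment")
)

# Step 2: cumulative max of number within each (Symbol, segment) block.
# The string "Symbol" refers to the index level of that name.
df["rolling_max"] = df.groupby(["Symbol", segment])["number"].cummax()

print(df)
```

Keeping segment as a separate Series makes it easy to inspect where each reset takes effect before computing the maximum.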
Source: https://stackoverflow.com/questions/52651800/how-to-conditionally-reset-a-rolling-maxs-initial-value-row-in-pandas-multiinde