In [46]: d = np.random.randn(10, 1) * 2
In [47]: df = pd.DataFrame(d.astype(int), columns=['data'])
I am trying to create a cumsum column that resets after each sign change in the data column, like this:
   data  custom_cumsum
0    -2             -2
1    -1             -3
2     1              1
3    -3             -3
4    -1             -4
5     2              2
6     0              2
7     3              5
8    -1             -1
9    -2             -3
I am able to achieve this with df.iterrows(), but I would like to avoid iterrows and do it with vectorized operations. There are a couple of questions about resetting cumsum when there is a NaN, but I have not been able to get this cumsum from those solutions.
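For reference, the row-by-row behaviour being replaced could look roughly like the sketch below. This is not the original iterrows code, which is not shown. It uses a plain Python loop, the helper name `loop_cumsum` is made up for illustration, and it assumes that 0 counts as non-negative, as row 6 of the example suggests.

```python
import pandas as pd

# Values copied from the example table, so the result is reproducible
df = pd.DataFrame({'data': [-2, -1, 1, -3, -1, 2, 0, 3, -1, -2]})

def loop_cumsum(values):
    """Running sum that restarts whenever the sign flips (0 treated as non-negative)."""
    out = []
    running = 0
    prev_neg = None  # no previous row yet, so the first row always starts a new run
    for v in values:
        neg = v < 0
        if neg != prev_neg:
            running = 0  # sign changed: reset the running total
        running += v
        out.append(running)
        prev_neg = neg
    return out

df['custom_cumsum'] = loop_cumsum(df['data'])
print(df)
```

A loop like this is handy as a baseline for checking any vectorized version against.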
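Create a new key to group by, then take the cumsum within each group.
Creating the new key: compare each row's sign flag (negative or not) with the previous row's. Whenever the flag changes, the running count goes up by one, so that row starts the next group.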
df.groupby(df.data.lt(0).astype(int).diff().ne(0).cumsum()).data.cumsum()
Out[798]:
0   -2
1   -3
2    1
3   -3
4   -4
5    2
6    2
7    5
8   -1
9   -3
Name: data, dtype: int64
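To see why the one-liner works, it may help to split it into its intermediate steps. The sketch below uses the values from the question's table instead of random data, and the names `is_neg`, `changed`, and `group_key` are only illustrative.

```python
import pandas as pd

# Values copied from the question's example table
df = pd.DataFrame({'data': [-2, -1, 1, -3, -1, 2, 0, 3, -1, -2]})

# Step 1: sign flag, 1 for negative values and 0 otherwise (so 0 counts as non-negative)
is_neg = df['data'].lt(0).astype(int)

# Step 2: True wherever the flag differs from the previous row.
# diff() gives NaN on the first row, and NaN != 0, so the first row is also True.
changed = is_neg.diff().ne(0)

# Step 3: the running count of changes labels each run of same-signed values
group_key = changed.cumsum()

# Step 4: cumulative sum restarted inside each run
df['custom_cumsum'] = df.groupby(group_key)['data'].cumsum()

print(df.assign(group=group_key))
```

Because the flag is built with `lt(0)`, a zero joins the run of non-negative values around it, which matches row 6 of the expected output. If zeros should be handled differently, the flag in step 1 is the place to change.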
Source: https://stackoverflow.com/questions/49390300/how-to-reset-cumsum-after-change-in-sign-of-values