Pandas dataframe - running sum with reset

情深已故 2020-12-03 07:27

I want to calculate the running sum of a given column (without using loops, of course). The caveat is that I have another column that specifies when to reset the running s

1 answer
  • 2020-12-03 07:35

    You can call cumsum() twice: once on reset to label the groups, then on val within each group:

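    import pandas as pd

    # sample data; desired_col holds the expected running sum
    df = pd.DataFrame({'reset': [0, 0, 0, 1, 1, 0, 0, 1],
                       'val': [1, 5, 4, 2, -1, 6, 4, 2],
                       'desired_col': [1, 6, 10, 2, -1, 5, 9, 2]})
    print(df)
    #   reset  val  desired_col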
    #0      0    1            1
    #1      0    5            6
    #2      0    4           10
    #3      1    2            2
    #4      1   -1           -1
    #5      0    6            5
    #6      0    4            9
    #7      1    2            2
    # every 1 in reset starts a new group, so its running total is a group id
    df['cumsum'] = df['reset'].cumsum()
    # cumulative sum of val within each group, stored in column des
    df['des'] = df.groupby(['cumsum'])['val'].cumsum()
    print(df)
    #   reset  val  desired_col  cumsum  des
    #0      0    1            1       0    1
    #1      0    5            6       0    6
    #2      0    4           10       0   10
    #3      1    2            2       1    2
    #4      1   -1           -1       2   -1
    #5      0    6            5       2    5
    #6      0    4            9       2    9
    #7      1    2            2       3    2
    # remove the helper columns desired_col and cumsum
    df = df.drop(['desired_col', 'cumsum'], axis=1)
    print(df)
    #   reset  val  des
    #0      0    1    1
    #1      0    5    6
    #2      0    4   10
    #3      1    2    2
    #4      1   -1   -1
    #5      0    6    5
    #6      0    4    9
    #7      1    2    2
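
The same idea can be written without the helper column. This is only a minimal sketch, assuming the sample data from the tables above; the variable name `group_id` is made up for illustration and does not appear in the answer.

```python
import pandas as pd

# Sample data copied from the tables above.
df = pd.DataFrame({
    "reset": [0, 0, 0, 1, 1, 0, 0, 1],
    "val":   [1, 5, 4, 2, -1, 6, 4, 2],
})

# Each 1 in reset bumps the running total, so rows between resets share an id.
group_id = df["reset"].cumsum()

# Running sum of val that restarts whenever the group id changes.
df["des"] = df["val"].groupby(group_id).cumsum()

print(df)
```

Grouping by a Series instead of a column name keeps the DataFrame free of temporary columns, so there is nothing to drop afterwards.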
    