python pandas conditional cumulative sum

一个人的身影 2020-12-15 10:49

Consider my dataframe df

data  data_binary  sum_data
  2       1            1
  5       0            0
  1       1            1
  4       1            2
  3       1            3
 10       0            0
  7       0            0
  3       1            1


        
3 Answers
  •  暖寄归人
    2020-12-15 11:05

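    I think you can use groupby with DataFrameGroupBy.cumsum, grouping by a helper Series. First compare each value of data_binary with the previous one (the column shifted by one row) and flag the rows where they are not equal (!=). Then take the cumsum of those flags, which gives every run of consecutive equal values its own group number. Finally, use mask to set the result to 0 wherever data_binary is 0: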

    print (df.data_binary.ne(df.data_binary.shift()).cumsum())
    0    1
    1    2
    2    3
    3    3
    4    3
    5    4
    6    4
    7    5
    Name: data_binary, dtype: int32
    
    # label each run of consecutive equal values, then cumsum within each run
    groups = df.data_binary.ne(df.data_binary.shift()).cumsum()
    df['sum_data1'] = df.data_binary.groupby(groups).cumsum()
    df['sum_data1'] = df['sum_data1'].mask(df.data_binary == 0, 0)
    print (df)
       data  data_binary  sum_data  sum_data1
    0     2            1         1          1
    1     5            0         0          0
    2     1            1         1          1
    3     4            1         2          2
    4     3            1         3          3
    5    10            0         0          0
    6     7            0         0          0
    7     3            1         1          1
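
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample values are copied from the frame printed above; the helper name `run_cumsum` is my own and not part of pandas, so treat it as an illustration rather than the one right way to do it.

```python
import pandas as pd

# Sample data copied from the printed frame above.
df = pd.DataFrame({
    "data": [2, 5, 1, 4, 3, 10, 7, 3],
    "data_binary": [1, 0, 1, 1, 1, 0, 0, 1],
})


def run_cumsum(flags: pd.Series) -> pd.Series:
    """Cumulative sum of `flags` that restarts at every change of value.

    Hypothetical helper name, used only for this example.
    """
    # A new run starts wherever a value differs from the previous row.
    run_id = flags.ne(flags.shift()).cumsum()
    # Sum within each run, then force rows where the flag is 0 to 0.
    return flags.groupby(run_id).cumsum().mask(flags.eq(0), 0)


df["sum_data1"] = run_cumsum(df["data_binary"])
print(df)
```

With a 0/1 column the runs of zeros already sum to 0, so the mask step mostly matters if you sum a different column (for example `data`) over the same groups.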
    
