pandas rolling computation with window based on values instead of counts

暗喜 2020-12-04 17:17

I'm looking for a way to do something like the various rolling_* functions of pandas, but I want the window of the rolling computation to be defin

3 answers
  •  温柔的废话
    2020-12-04 17:30

    I think this does what you want:

    In [1]: df
    Out[1]:
       RollBasis  ToRoll
    0          1       1
    1          1       4
    2          1      -5
    3          2       2
    4          3      -4
    5          5      -2
    6          8       0
    7         10     -13
    8         12      -2
    9         13      -5
    
    In [2]: def f(x):
       ...:     ser = df.ToRoll[(df.RollBasis >= x) & (df.RollBasis < x+5)]
       ...:     return ser.sum()
    

    The function takes a single RollBasis value x and uses a boolean mask to select the ToRoll entries whose RollBasis falls in the half-open window [x, x + 5). It then returns the sum of that selection. Note that the window looks forward from x, and that its width (5) is hard-coded.
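
    To make the window concrete, here is a minimal sketch that rebuilds the toy frame and evaluates a single window by hand. The helper name window_sum and its width parameter are illustrative assumptions, not part of the answer; the selection logic is the same boolean mask used in f.

```python
import pandas as pd

# Toy data copied from the answer above
df = pd.DataFrame({
    "RollBasis": [1, 1, 1, 2, 3, 5, 8, 10, 12, 13],
    "ToRoll": [1, 4, -5, 2, -4, -2, 0, -13, -2, -5],
})

def window_sum(frame, start, width=5):
    """Sum ToRoll over rows whose RollBasis lies in [start, start + width)."""
    mask = (frame["RollBasis"] >= start) & (frame["RollBasis"] < start + width)
    return frame.loc[mask, "ToRoll"].sum()

# The window starting at 8 covers RollBasis values 8, 10 and 12:
# 0 + (-13) + (-2)
print(window_sum(df, 8))  # -15
```

    The row with RollBasis 13 is excluded because the upper bound is strict.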

    In [3]: df['Rolled'] = df.RollBasis.apply(f)
    
    In [4]: df
    Out[4]:
       RollBasis  ToRoll  Rolled
    0          1       1      -4
    1          1       4      -4
    2          1      -5      -4
    3          2       2      -4
    4          3      -4      -6
    5          5      -2      -2
    6          8       0     -15
    7         10     -13     -20
    8         12      -2      -7
    9         13      -5      -5
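
    The apply call above rescans the whole frame once per row. If RollBasis is sorted in ascending order, as it is in the toy data, one possible alternative is to compute all window sums from a prefix sum and two binary searches. This is a sketch of a different technique, not the method used in this answer; the function name rolling_sum_by_value is made up for illustration, and the sortedness of RollBasis is an assumption the code does not check.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "RollBasis": [1, 1, 1, 2, 3, 5, 8, 10, 12, 13],
    "ToRoll": [1, 4, -5, 2, -4, -2, 0, -13, -2, -5],
})

def rolling_sum_by_value(basis, values, width):
    """Sum `values` over the window [b, b + width) for each b in `basis`.

    Assumes `basis` is sorted in ascending order (not verified here).
    """
    basis = np.asarray(basis)
    values = np.asarray(values)
    # prefix[i] holds the sum of values[:i]
    prefix = np.concatenate(([0], np.cumsum(values)))
    # Index of the first row with basis >= b
    lo = np.searchsorted(basis, basis, side="left")
    # Index of the first row with basis >= b + width
    hi = np.searchsorted(basis, basis + width, side="left")
    return prefix[hi] - prefix[lo]

df["Rolled"] = rolling_sum_by_value(df["RollBasis"], df["ToRoll"], width=5)
print(df)
```

    On the toy data this gives the same Rolled column as the apply-based version.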
    

    Code for the toy example DataFrame in case someone else wants to try:

    In [1]: import pandas as pd
    
    In [2]: import io
    
    In [3]: text = """\
       ...:    RollBasis  ToRoll
       ...: 0          1       1
       ...: 1          1       4
       ...: 2          1      -5
       ...: 3          2       2
       ...: 4          3      -4
       ...: 5          5      -2
       ...: 6          8       0
       ...: 7         10     -13
       ...: 8         12      -2
       ...: 9         13      -5
       ...: """
    
    In [4]: # text is a str, so wrap it in StringIO (BytesIO needs bytes on Python 3)
       ...: df = pd.read_csv(io.StringIO(text), header=0, index_col=0, sep=r'\s+')
    
