Computing diffs within groups of a dataframe

你的背包 2020-11-30 19:21

Say I have a dataframe with 3 columns: Date, Ticker, Value (no index, at least to start with). I have many dates and many tickers, but each (ticker, date) tupl

6 Answers
  •  悲&欢浪女
    2020-11-30 19:55

    Here is a solution that builds on what @behzad.nouri wrote, but using pd.IndexSlice:

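    import numpy as np
    import pandas as pd

    # Index by (ticker, date) and sort so each ticker's rows are in date order
    df = df.set_index(['ticker', 'date']).sort_index()[['value']]
    df['diff'] = np.nan
    idx = pd.IndexSlice

    # Fill in the diff one ticker at a time
    for ix in df.index.levels[0]:
        df.loc[idx[ix, :], 'diff'] = df.loc[idx[ix, :], 'value'].diff()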
    

    Given this input:

    > df
       date ticker  value
    0    63      C   1.65
    1    88      C  -1.93
    2    22      C  -1.29
    3    76      A  -0.79
    4    72      B  -1.24
    5    34      A  -0.23
    6    92      B   2.43
    7    22      A   0.55
    8    32      A  -2.50
    9    59      B  -1.01
    

    It returns:

    > df
                 value  diff
    ticker date             
    A      22     0.55   NaN
           32    -2.50 -3.05
           34    -0.23  2.27
           76    -0.79 -0.56
    B      59    -1.01   NaN
           72    -1.24 -0.23
           92     2.43  3.67
    C      22    -1.29   NaN
           63     1.65  2.94
           88    -1.93 -3.58
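
For comparison, the same per-ticker diff can probably be written without the explicit loop by grouping on the `ticker` index level. This is only a sketch, not the method used in the answer above: it assumes the lowercase column names from the example, rebuilds the sample frame by hand, and uses `out` as a made-up variable name.

```python
import pandas as pd

# Sample data copied from the example input above.
df = pd.DataFrame({
    "date":   [63, 88, 22, 76, 72, 34, 92, 22, 32, 59],
    "ticker": ["C", "C", "C", "A", "B", "A", "B", "A", "A", "B"],
    "value":  [1.65, -1.93, -1.29, -0.79, -1.24, -0.23, 2.43, 0.55, -2.50, -1.01],
})

# Same preparation as the loop version: index by (ticker, date) and sort,
# so rows within each ticker are in date order.
out = df.set_index(["ticker", "date"]).sort_index()[["value"]]

# Diff within each ticker group; the first row of every group is NaN.
out["diff"] = out.groupby(level="ticker")["value"].diff()

print(out)
```

On the sample data this should give the same `diff` column as the loop, with one NaN at the start of each ticker. The grouping does the per-ticker split in one call, so there is no need to pre-fill the column with `np.nan` or slice with `pd.IndexSlice`.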
    
