Computing diffs within groups of a dataframe

前端未结

关注

 6  397

你的背包 2020-11-30 19:21

Say I have a dataframe with 3 columns: Date, Ticker, Value (no index, at least to start with). I have many dates and many tickers, but each (ticker, date) tupl

6条回答

悲&欢浪女 (楼主)

2020-11-30 19:55

Here is a solution that builds on what @behzad.nouri wrote, but using pd.IndexSlice:

df =  df.set_index(['ticker', 'date']).sort_index()[['value']]
df['diff'] = np.nan
idx = pd.IndexSlice

for ix in df.index.levels[0]:
    df.loc[ idx[ix,:], 'diff'] = df.loc[idx[ix,:], 'value' ].diff()

For:

> df
   date ticker  value
0    63      C   1.65
1    88      C  -1.93
2    22      C  -1.29
3    76      A  -0.79
4    72      B  -1.24
5    34      A  -0.23
6    92      B   2.43
7    22      A   0.55
8    32      A  -2.50
9    59      B  -1.01

It returns:

> df
             value  diff
ticker date             
A      22     0.55   NaN
       32    -2.50 -3.05
       34    -0.23  2.27
       76    -0.79 -0.56
B      59    -1.01   NaN
       72    -1.24 -0.23
       92     2.43  3.67
C      22    -1.29   NaN
       63     1.65  2.94
       88    -1.93 -3.58

0 讨论(0)

查看其它6个回答