Add a field to a pandas DataFrame with MultiIndex columns

Submitted by 岁酱吖の on 2019-12-03 02:51:23

As a workaround (there isn't really an API that does exactly what you want), you could also consider a bit of reshaping-fu if you don't want to use a Panel. I wouldn't recommend it on enormous data sets, though: use a Panel for that.

In [30]: df = dftst.stack(0)

In [31]: df['close_avg'] = df.close.unstack().rolling(5).mean().stack()

In [32]: df
Out[32]: 
field                          close      rate  close_avg
                    ticker                               
2009-03-01 06:29:59 AAPL   -0.223042  0.554996        NaN
                    GOOG    0.060127 -0.333992        NaN
                    GS      0.117626 -1.256790        NaN
2009-03-02 06:29:59 AAPL   -0.513743 -0.402661        NaN
                    GOOG    0.059828 -0.125288        NaN
                    GS     -0.336196 -0.510595        NaN
2009-03-03 06:29:59 AAPL    0.142202 -1.038470        NaN
                    GOOG   -1.099251 -0.892581        NaN
                    GS      1.698086  0.885023        NaN
2009-03-04 06:29:59 AAPL   -1.125821  0.413005        NaN
                    GOOG    0.424290  1.106983        NaN
                    GS      0.047158  0.680714        NaN
2009-03-05 06:29:59 AAPL    0.470050  1.845354  -0.250071
                    GOOG    0.132956 -0.488800  -0.084410
                    GS      0.129190  0.208077   0.331173
2009-03-06 06:29:59 AAPL   -0.087360 -2.102512  -0.222934
                    GOOG    0.165100 -0.134886  -0.063415
                    GS      0.167720  0.082480   0.341192
2009-03-07 06:29:59 AAPL   -0.768542 -0.176076  -0.273894
                    GOOG    0.417694  2.257074   0.008158
                    GS     -1.744730 -1.850185   0.059485
2009-03-08 06:29:59 AAPL   -0.297363 -0.633828  -0.361807
                    GOOG   -1.096703 -0.572138   0.008667
                    GS      0.890016 -2.621563  -0.102129
2009-03-09 06:29:59 AAPL    1.038579  0.053330   0.071073
                    GOOG   -0.614050  0.607944  -0.199001
                    GS     -0.882848  0.596801  -0.288130
2009-03-10 06:29:59 AAPL   -0.255226  0.058178  -0.073982
                    GOOG    1.761861  1.841751   0.126780
                    GS     -0.549998 -1.551281  -0.423968
2009-03-11 06:29:59 AAPL    0.413522  0.149089   0.026194
                    GOOG   -2.964163  1.825312  -0.499072
                    GS     -0.373303  1.137001  -0.532173
2009-03-12 06:29:59 AAPL   -0.924776  1.238546  -0.005053
                    GOOG   -0.985956 -0.906590  -0.779802
                    GS     -0.320400  1.239681  -0.247307
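For readers who want to try the reshaping round trip end to end, here is a minimal sketch. The frame below is a randomly generated stand-in for dftst (its values, the seed, and the variable names stacked and result are assumptions, not the data shown above), and it uses the .rolling(...).mean() spelling available since pandas 0.18.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dftst: 12 timestamps, 3 tickers, 2 fields.
idx = pd.date_range("2009-03-01 06:29:59", periods=12, freq="D")
cols = pd.MultiIndex.from_product(
    [["AAPL", "GOOG", "GS"], ["close", "rate"]], names=["ticker", "field"]
)
rng = np.random.default_rng(0)
dftst = pd.DataFrame(rng.standard_normal((12, 6)), index=idx, columns=cols)

# Move the ticker level from the columns into the row index.
stacked = dftst.stack(0)

# Rolling mean per ticker: unstack to (timestamp x ticker), roll, stack back.
# The assignment aligns on the (timestamp, ticker) index.
stacked["close_avg"] = stacked["close"].unstack().rolling(5).mean().stack()

# Optional: restore the original (ticker, field) column layout.
result = stacked.unstack("ticker").swaplevel(axis=1).sort_index(axis=1)
```

The last step is only needed if the wide layout is wanted again; the stacked frame is often easier to work with when adding further fields.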

I don't know how to do the broadcasting you want, but for strict assignment this should do it:

dftst[('GOOG', 'avg_close')] = 7

More specifically but still without broadcasting:

# cols_1 is assumed to hold the ticker labels, e.g. dftst.columns.levels[0]
for tic in cols_1:
    dftst[(tic, 'avg_close')] = dftst[(tic, 'close')].rolling(5).mean()
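A self-contained version of the per-ticker loop might look like the sketch below. The sample frame is a made-up stand-in for dftst, and the ticker labels are read from the column index instead of a separate cols_1 list; both choices are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dftst: 12 timestamps, 3 tickers, 2 fields.
idx = pd.date_range("2009-03-01 06:29:59", periods=12, freq="D")
cols = pd.MultiIndex.from_product(
    [["AAPL", "GOOG", "GS"], ["close", "rate"]], names=["ticker", "field"]
)
rng = np.random.default_rng(1)
dftst = pd.DataFrame(rng.standard_normal((12, 6)), index=idx, columns=cols)

# Take the ticker labels once, before any columns are added.
tickers = dftst.columns.get_level_values("ticker").unique()

# Assign one new (ticker, 'avg_close') column per ticker.
for tic in tickers:
    dftst[(tic, "avg_close")] = dftst[(tic, "close")].rolling(5).mean()

# New columns are appended at the end; sorting groups them by ticker again.
dftst = dftst.sort_index(axis=1)
```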

This question is a decade old, but I had exactly the same problem. Here is a short way to do what you are looking for. Since pandas 0.18, rolling means are computed with .rolling(...).mean(), so the call looks a bit different now, but you get the point.

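avg_close = dftst.xs('close', axis=1, level=1).rolling(5).mean()
# Relabel the columns as (ticker, 'avg_close') pairs, then append them to dftst
avg_close.columns = pd.MultiIndex.from_product([avg_close.columns, ['avg_close']], names=dftst.columns.names)
dftst = pd.concat([dftst, avg_close], axis=1)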
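The same cross-section idea can be wrapped so that any per-ticker computation is added as a new field in one call. The helper name add_field and its parameters are hypothetical, the sample frame is made up, and the column levels are assumed to be named "ticker" and "field"; treat this as a sketch of the approach, not an established pandas API.

```python
import numpy as np
import pandas as pd


def add_field(df, source, new, func):
    """Return df plus a `new` field computed from `source` (hypothetical helper)."""
    # Cross-section: one column per ticker holding the source field.
    wide = func(df.xs(source, axis=1, level="field"))
    # Relabel the result as (ticker, new) pairs so it fits the column index.
    wide.columns = pd.MultiIndex.from_product(
        [wide.columns, [new]], names=df.columns.names
    )
    return pd.concat([df, wide], axis=1).sort_index(axis=1)


# Hypothetical stand-in for dftst: 12 timestamps, 3 tickers, 2 fields.
idx = pd.date_range("2009-03-01 06:29:59", periods=12, freq="D")
cols = pd.MultiIndex.from_product(
    [["AAPL", "GOOG", "GS"], ["close", "rate"]], names=["ticker", "field"]
)
rng = np.random.default_rng(2)
dftst = pd.DataFrame(rng.standard_normal((12, 6)), index=idx, columns=cols)

out = add_field(dftst, "close", "avg_close", lambda x: x.rolling(5).mean())
```

Because the computation runs on the whole (timestamp x ticker) frame at once, no explicit loop over tickers is needed.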

For this particular problem, it seems that using a Panel object works. I did the following (taking dftst from my original post):

pn = dftst.T.to_panel()
print pn

Out[83]: 
<class 'pandas.core.panel.Panel'>
Dimensions: 12 (items) x 3 (major_axis) x 2 (minor_axis)
Items axis: 2009-03-01 06:29:59 to 2009-03-12 06:29:59
Major_axis axis: AAPL to GS
Minor_axis axis: close to rate

Next, I move the fields ('close', 'rate') to the Items axis by transposing the Panel:

pn = pn.transpose(2,0,1)
print pn

Out[91]: 
<class 'pandas.core.panel.Panel'>
Dimensions: 2 (items) x 12 (major_axis) x 3 (minor_axis)
Items axis: close to rate
Major_axis axis: 2009-03-01 06:29:59 to 2009-03-12 06:29:59
Minor_axis axis: AAPL to GS

Now I can do a time series operation and add it as a field in the Panel object:

pn['avg_close'] = pandas.rolling_mean(pn['close'], 5)
print pn

Out[93]: 
<class 'pandas.core.panel.Panel'>
Dimensions: 3 (items) x 12 (major_axis) x 3 (minor_axis)
Items axis: close to avg_close
Major_axis axis: 2009-03-01 06:29:59 to 2009-03-12 06:29:59
Minor_axis axis: AAPL to GS

print pn['avg_close']

Out[94]: 
ticker                   AAPL      GOOG        GS
2009-03-01 06:29:59       NaN       NaN       NaN
2009-03-02 06:29:59       NaN       NaN       NaN
2009-03-03 06:29:59       NaN       NaN       NaN
2009-03-04 06:29:59       NaN       NaN       NaN
2009-03-05 06:29:59  0.303719 -0.129300 -0.037954
2009-03-06 06:29:59 -0.006839  0.206331  0.336467
2009-03-07 06:29:59  0.128299  0.174935  0.698275
2009-03-08 06:29:59  0.471010 -0.137343  0.671049
2009-03-09 06:29:59 -0.279855 -0.033427  0.848610
2009-03-10 06:29:59 -0.516032  0.260944  0.373046
2009-03-11 06:29:59 -0.456213  0.164710  0.910448
2009-03-12 06:29:59 -0.799156  0.544132  0.862764

I am actually having some other problems with the Panel objects, but I will leave those to another post.
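Panel has since been removed from pandas, so the steps above no longer run on current releases. One way to keep the same "one item per field" view is a dict of (timestamp x ticker) frames concatenated along the columns. The sketch below uses a made-up stand-in for dftst, and the names fields, by_field, and wide are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dftst: 12 timestamps, 3 tickers, 2 fields.
idx = pd.date_range("2009-03-01 06:29:59", periods=12, freq="D")
cols = pd.MultiIndex.from_product(
    [["AAPL", "GOOG", "GS"], ["close", "rate"]], names=["ticker", "field"]
)
rng = np.random.default_rng(3)
dftst = pd.DataFrame(rng.standard_normal((12, 6)), index=idx, columns=cols)

# One (timestamp x ticker) frame per field, playing the role of the Panel items.
fields = {f: dftst.xs(f, axis=1, level="field") for f in ["close", "rate"]}

# Adding a field is now a plain dict assignment.
fields["avg_close"] = fields["close"].rolling(5).mean()

# Glue the items back together with the field as the outer column level.
by_field = pd.concat(fields, axis=1, names=["field", "ticker"])

# Optional: return to the original (ticker, field) layout.
wide = by_field.swaplevel(axis=1).sort_index(axis=1)
```

Selecting by_field["avg_close"] then gives a timestamp-by-ticker table like the one printed from the Panel.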
