Adding a row to a MultiIndex DataFrame/Series

Submitted by 跟風遠走 on 2019-12-29 05:19:25

Question


I was wondering if there is an equivalent way to add a row to a Series or DataFrame with a MultiIndex as there is with a single index, i.e. using .ix or .loc?

I thought the natural way would be something like

row_to_add = pd.MultiIndex.from_tuples()
df.ix[row_to_add] = my_row

but that raises a KeyError. I know I can use .append(), but I would find it much neater to use .ix[] or .loc[].

Here is an example:

>>> import datetime as dt
>>> import pandas as pd
>>> df = pd.DataFrame({'Time': [dt.datetime(2013,2,3,9,0,1), dt.datetime(2013,2,3,9,0,1)], 'hsec': [1,25], 'vals': [45,46]})
>>> df
                 Time  hsec  vals
0 2013-02-03 09:00:01     1    45
1 2013-02-03 09:00:01    25    46

[2 rows x 3 columns]
>>> df.set_index(['Time','hsec'],inplace=True)
>>> ind = pd.MultiIndex.from_tuples([(dt.datetime(2013,2,3,9,0,2),0)],names=['Time','hsec'])
>>> df.ix[ind] = 5

Traceback (most recent call last):
  File "<pyshell#201>", line 1, in <module>
    df.ix[ind] = 5
  File "C:\Program Files\Python27\lib\site-packages\pandas\core\indexing.py", line 96, in __setitem__
    indexer = self._convert_to_indexer(key, is_setter=True)
  File "C:\Program Files\Python27\lib\site-packages\pandas\core\indexing.py", line 967, in _convert_to_indexer
    raise KeyError('%s not in index' % objarr[mask])
KeyError: "[(Timestamp('2013-02-03 09:00:02', tz=None), 0L)] not in index"

Answer 1:


You have to specify a tuple for the multi-indexing to work, and you have to fully specify all axes (i.e. the trailing : is necessary):

In [26]: df.ix[(dt.datetime(2013,2,3,9,0,2),0),:] = 5

In [27]: df
Out[27]: 
                          vals
Time                hsec      
2013-02-03 09:00:01 1       45
                    25      46
2013-02-03 09:00:02 0        5

It is usually easier to reindex, or to concat/append a new DataFrame, though. Setting with this kind of enlargement generally only makes sense for a small number of values, because each such assignment makes a copy.
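
As a rough sketch of the concat route mentioned above, the new row can be built as its own one-row DataFrame with a matching MultiIndex and then joined on. DataFrame.append was removed in pandas 2.0, so the sketch uses pd.concat. The name new_rows is only illustrative and not part of the original answer.

```python
import datetime as dt

import pandas as pd

# Rebuild the frame from the question, indexed by (Time, hsec).
df = pd.DataFrame(
    {
        "Time": [dt.datetime(2013, 2, 3, 9, 0, 1), dt.datetime(2013, 2, 3, 9, 0, 1)],
        "hsec": [1, 25],
        "vals": [45, 46],
    }
).set_index(["Time", "hsec"])

# Build the row(s) to add as a separate frame with the same index level names.
new_index = pd.MultiIndex.from_tuples(
    [(dt.datetime(2013, 2, 3, 9, 0, 2), 0)], names=["Time", "hsec"]
)
new_rows = pd.DataFrame({"vals": [5]}, index=new_index)  # hypothetical helper frame

# Join once; collecting many rows first and concatenating a single time
# avoids copying the whole frame for every added row.
df = pd.concat([df, new_rows])
print(df)
```

Collecting all new rows in one frame and calling pd.concat once is likely to scale better than repeated enlargement when more than a handful of rows are added.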




Answer 2:


Update, since .ix is deprecated: today you can use .loc instead:

# say you have dataframe x
x
Out[78]: 
              a    b       time
indA indB                     
a    i      0.0  NaN 2018-09-12
b    j      1.0  2.0 2018-10-12
c    k      2.0  3.0 2018-11-12
     f      NaN  NaN        NaT
d    i      5.0  NaN        NaT

x.loc[('a','k'),:] = (3.5,6,pd.NaT)

x
Out[80]: 
              a    b       time
indA indB                     
a    i      0.0  NaN 2018-09-12
b    j      1.0  2.0 2018-10-12
c    k      2.0  3.0 2018-11-12
     f      NaN  NaN        NaT
d    i      5.0  NaN        NaT
a    k      3.5  6.0        NaT
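
For readers who want to try the .loc approach, here is a minimal self-contained sketch. The answer does not show how x was constructed, so the small frame below is an assumption that only mimics its index layout (levels indA and indB), with fewer rows and columns.

```python
import pandas as pd

# Hypothetical stand-in for the answer's x: a two-level index and two columns.
index = pd.MultiIndex.from_tuples(
    [("a", "i"), ("b", "j")], names=["indA", "indB"]
)
x = pd.DataFrame({"a": [0.0, 1.0], "b": [2.0, 3.0]}, index=index)

# The full key ("a", "k") is not in the index yet. Giving the complete tuple
# plus ":" for the columns lets .loc enlarge the frame with a new row.
x.loc[("a", "k"), :] = (3.5, 6.0)
print(x)
```

The new row is appended at the end rather than placed next to the existing "a" entries, as in the output above; calling x.sort_index() afterwards regroups the index if that matters.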


Source: https://stackoverflow.com/questions/24917700/adding-a-row-to-a-multiindex-dataframe-series
