I have a pandas dataframe with 3 levels of a MultiIndex. I am trying to pull out rows of this dataframe according to a list of values that correspond to two of the levels.
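I find it interesting that this doesn't work (with idx = pd.IndexSlice). Instead of only the ('foo', 'can') and ('bar', 'baz') pairs, it returns every row: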
In [45]: df.loc[(idx[:, 'foo', 'can'], idx[:, 'bar', 'baz']), ]
Out[45]:
           hi
a b   c
1 bar baz   2
      can   3
  foo baz   0
      can   1
2 bar baz   6
      can   7
  foo baz   4
      can   5
3 bar baz  10
      can  11
  foo baz   8
      can   9
It sort of looks like it "should", somehow. In any case, here's a reasonable workaround:
Let's assume the tuples you want to slice by are in the index of another DataFrame (since it sounds like they probably are in your case!).
In [53]: ix_use = pd.MultiIndex.from_tuples([('foo', 'can'), ('bar', 'baz')], names=['b', 'c'])
In [55]: other = pd.DataFrame(dict(a=1), index=ix_use)
In [56]: other
Out[56]:
         a
b   c
foo can  1
bar baz  1
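The post never shows how df itself was built. The following is a minimal sketch that reproduces the frames printed above, assuming df is a sorted product of the three levels with a single integer column hi; that construction is a guess made to match the output, not the asker's actual code.

```python
import pandas as pd

# Assumed reconstruction of df: chosen only so that it matches the printed output.
index = pd.MultiIndex.from_product(
    [[1, 2, 3], ["foo", "bar"], ["baz", "can"]], names=["a", "b", "c"]
)
df = pd.DataFrame({"hi": range(12)}, index=index).sort_index()

# The (b, c) pairs to select, held in the index of a second frame as in the answer.
ix_use = pd.MultiIndex.from_tuples(
    [("foo", "can"), ("bar", "baz")], names=["b", "c"]
)
other = pd.DataFrame({"a": 1}, index=ix_use)

print(df)
print(other)
```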
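Now, to slice df by the index of other, we can use the fact that .loc accepts a list of tuples (the MultiIndex section of the pandas indexing documentation shows an example). The older .ix indexer accepted one too, but it was deprecated in pandas 0.20 and removed in 1.0.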
First let's build the list of tuples we want:
In [13]: idx = [(x, ) + y for x in df.index.levels[0] for y in other.index.values]
In [14]: idx
Out[14]:
[(1, 'foo', 'can'),
 (1, 'bar', 'baz'),
 (2, 'foo', 'can'),
 (2, 'bar', 'baz'),
 (3, 'foo', 'can'),
 (3, 'bar', 'baz')]
Now we can pass this list to .loc (or to .ix on pandas versions before 1.0, where it still exists):
In [17]: df.loc[idx]
Out[17]:
           hi
a b   c
1 foo can   1
  bar baz   2
2 foo can   5
  bar baz   6
3 foo can   9
  bar baz  10
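Put together, the workaround might look like the sketch below on a current pandas release. The construction of df is an assumption made to match the output above, and the names pairs, keys, by_keys, mask and by_mask are only illustrative. The boolean-mask variant at the end is not part of the original answer; it is a different technique, included for comparison.

```python
import pandas as pd

# Assumed reconstruction of df (not shown in the original post).
index = pd.MultiIndex.from_product(
    [[1, 2, 3], ["foo", "bar"], ["baz", "can"]], names=["a", "b", "c"]
)
df = pd.DataFrame({"hi": range(12)}, index=index).sort_index()

# The (b, c) pairs we want to keep for every value of level a.
pairs = [("foo", "can"), ("bar", "baz")]

# Approach from the answer: build full (a, b, c) keys, then pass the list to .loc.
keys = [(a,) + pair for a in df.index.levels[0] for pair in pairs]
by_keys = df.loc[keys]

# Alternative, not from the answer: test the (b, c) part of the index for membership.
mask = df.index.droplevel("a").isin(pairs)
by_mask = df[mask]

print(by_keys)
print(by_mask)
```

The two results contain the same rows, but in a different order: .loc follows the order of keys, while the mask keeps the order of df. Note also that df.index.levels[0] can contain values that no longer occur in df after filtering, and recent pandas versions raise a KeyError when .loc receives a key that is missing, so the mask may be the safer choice in that situation.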