I have two dataframes. df1 is multi-indexed:
value
first second
a x 0.471780
y 0.774908
z 0.563634
b
According to the documentation, as of pandas 0.14, you can simply join single-index and multiindex dataframes. It will match on the common index name. The how argument works as expected with 'inner' and 'outer', though interestingly it seems to be reversed for 'left' and 'right' (could this be a bug?).
df1 = pd.DataFrame([['a', 'x', 0.471780], ['a','y', 0.774908], ['a', 'z', 0.563634],
['b', 'x', -0.353756], ['b', 'y', 0.368062], ['b', 'z', -1.721840],
['c', 'x', 1], ['c', 'y', 2], ['c', 'z', 3],
],
columns=['first', 'second', 'value1']
).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10], ['b', 20]],
columns=['first', 'value2']).set_index(['first'])
print(df1.join(df2, how='inner'))
value1 value2
first second
a x 0.471780 10
y 0.774908 10
z 0.563634 10
b x -0.353756 20
y 0.368062 20
z -1.721840 20
You could use get_level_values:
firsts = df1.index.get_level_values('first')
df1['value2'] = df2.loc[firsts].values
Note: you are almost doing a join here (except the df1 is MultiIndex)... so there may be a neater way to describe this...
.
In an example (similar to what you have):
df1 = pd.DataFrame([['a', 'x', 0.123], ['a','x', 0.234],
['a', 'y', 0.451], ['b', 'x', 0.453]],
columns=['first', 'second', 'value1']
).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10],['b', 20]],
columns=['first', 'value']).set_index(['first'])
firsts = df1.index.get_level_values('first')
df1['value2'] = df2.loc[firsts].values
In [5]: df1
Out[5]:
value1 value2
first second
a x 0.123 10
x 0.234 10
y 0.451 10
b x 0.453 20
As the .ix syntax is a powerful shortcut to reindexing, but in this case you are actually not doing any combined rows/column reindexing, this can be done a bit more elegantly (for my humble taste buds) with just using reindexing:
Preparation from hayden:
df1 = pd.DataFrame([['a', 'x', 0.123], ['a','x', 0.234],
['a', 'y', 0.451], ['b', 'x', 0.453]],
columns=['first', 'second', 'value1']
).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10],['b', 20]],
columns=['first', 'value']).set_index(['first'])
Then this looks like this in iPython:
In [4]: df1
Out[4]:
value1
first second
a x 0.123
x 0.234
y 0.451
b x 0.453
In [5]: df2
Out[5]:
value
first
a 10
b 20
In [7]: df2.reindex(df1.index, level=0)
Out[7]:
value
first second
a x 10
x 10
y 10
b x 20
In [8]: df1['value2'] = df2.reindex(df1.index, level=0)
In [9]: df1
Out[9]:
value1 value2
first second
a x 0.123 10
x 0.234 10
y 0.451 10
b x 0.453 20
The mnemotechnic for what level you have to use in the reindex method: It states for the level that you already covered in the bigger index. So, in this case df2 already had level 0 covered of the df1.index.