I want to find the values of col1 and col2 where the col1 and col2 of the first dataframe are both in the second dataframe.
Perform an inner merge on col1 and col2:
import pandas as pd
df1 = pd.DataFrame({'col1': ['pizza', 'hamburger', 'hamburger', 'pizza', 'ice cream'], 'col2': ['boy', 'boy', 'girl', 'girl', 'boy']}, index=range(1,6))
df2 = pd.DataFrame({'col1': ['pizza', 'pizza', 'chicken', 'cake', 'cake', 'chicken', 'ice cream'], 'col2': ['boy', 'girl', 'girl', 'boy', 'girl', 'boy', 'boy']}, index=range(10,17))
print(pd.merge(df2.reset_index(), df1, how='inner').set_index('index'))
yields
            col1  col2
index
10         pizza   boy
11         pizza  girl
16     ice cream   boy
The purpose of the reset_index and set_index calls is to preserve df2's index, as in the desired result you posted. If the index is not important, then
pd.merge(df2, df1, how='inner')
#         col1  col2
# 0      pizza   boy
# 1      pizza  girl
# 2  ice cream   boy
would suffice.
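One caveat with the merge approach: when no on argument is given, merge joins on every column name the two frames share, and an inner merge repeats a row of df2 once for each matching row in df1. The following is a minimal sketch, assuming you want each df2 row at most once. The keys variable and the small frames with a duplicated ('pizza', 'boy') pair are made up for illustration and are not the data from the question.

```python
import pandas as pd

keys = ['col1', 'col2']  # hypothetical name for the join columns

# Illustrative data: df1 contains ('pizza', 'boy') twice
df1 = pd.DataFrame({'col1': ['pizza', 'pizza', 'hamburger'],
                    'col2': ['boy', 'boy', 'girl']})
df2 = pd.DataFrame({'col1': ['pizza', 'chicken'],
                    'col2': ['boy', 'girl']}, index=[10, 11])

# Plain inner merge: the matching df2 row is repeated per matching df1 row
naive = pd.merge(df2.reset_index(), df1, on=keys, how='inner').set_index('index')

# Drop duplicate key pairs from df1 first so each df2 row appears at most once
deduped = pd.merge(df2.reset_index(), df1[keys].drop_duplicates(),
                   on=keys, how='inner').set_index('index')

print(len(naive), len(deduped))
```

Naming the keys explicitly also guards against the frames later gaining another shared column that would silently become part of the join.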
Alternatively, you could construct MultiIndexes out of the col1 and col2 columns, and then call the MultiIndex.isin method:
index1 = pd.MultiIndex.from_arrays([df1[col] for col in ['col1', 'col2']])
index2 = pd.MultiIndex.from_arrays([df2[col] for col in ['col1', 'col2']])
print(df2.loc[index2.isin(index1)])
yields
         col1  col2
10      pizza   boy
11      pizza  girl
16  ice cream   boy
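As a rough cross-check, the two approaches can be compared on the data above. This sketch assumes pandas 0.24 or later, where MultiIndex.from_frame is available as a shorthand for building the key pairs. The names keys, mask, via_isin and via_merge are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': ['pizza', 'hamburger', 'hamburger', 'pizza', 'ice cream'],
                    'col2': ['boy', 'boy', 'girl', 'girl', 'boy']}, index=range(1, 6))
df2 = pd.DataFrame({'col1': ['pizza', 'pizza', 'chicken', 'cake', 'cake', 'chicken', 'ice cream'],
                    'col2': ['boy', 'girl', 'girl', 'boy', 'girl', 'boy', 'boy']}, index=range(10, 17))

keys = ['col1', 'col2']  # hypothetical name for the key columns

# Boolean mask: True where df2's (col1, col2) pair also occurs in df1
mask = pd.MultiIndex.from_frame(df2[keys]).isin(pd.MultiIndex.from_frame(df1[keys]))
via_isin = df2.loc[mask]

# Same selection through the inner merge, keeping df2's index
via_merge = pd.merge(df2.reset_index(), df1, on=keys, how='inner').set_index('index')

print(via_isin.index.tolist())
print(via_isin.values.tolist() == via_merge.values.tolist())
```

Because isin only produces a mask over df2, this variant keeps df2's index without the reset_index and set_index round trip, and it never repeats a row even if df1 contains duplicate pairs.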