how to use pandas isin for multiple columns

谎友^ 2020-12-17 18:04

I want to find the values of col1 and col2 where the col1 and col2 of the first dataframe are both in the second dataframe.

3 Answers
  •  感情败类
    2020-12-17 18:51

    Perform an inner merge on col1 and col2:

    import pandas as pd
    df1 = pd.DataFrame({'col1': ['pizza', 'hamburger', 'hamburger', 'pizza', 'ice cream'], 'col2': ['boy', 'boy', 'girl', 'girl', 'boy']}, index=range(1,6))
    df2 = pd.DataFrame({'col1': ['pizza', 'pizza', 'chicken', 'cake', 'cake', 'chicken', 'ice cream'], 'col2': ['boy', 'girl', 'girl', 'boy', 'girl', 'boy', 'boy']}, index=range(10,17))
    
    print(pd.merge(df2.reset_index(), df1, how='inner').set_index('index'))
    

    yields

                col1  col2
    index                 
    10         pizza   boy
    11         pizza  girl
    16     ice cream   boy
    

    The purpose of the reset_index and set_index calls is to preserve df2's index, as in the desired result you posted. If the index is not important, then

    pd.merge(df2, df1, how='inner')
    #         col1  col2
    # 0      pizza   boy
    # 1      pizza  girl
    # 2  ice cream   boy
    

    would suffice.
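
One caveat with the merge approach: an inner merge returns one output row per matching pair of rows, so a df2 row is repeated if df1 contains the same (col1, col2) pair more than once. The sketch below uses small made-up frames (not the ones from the question) and an illustrative `keys` list to show the effect, and one possible way around it by deduplicating the key pairs first.

```python
import pandas as pd

# Hypothetical data: ('pizza', 'boy') appears twice in df1.
df1 = pd.DataFrame({'col1': ['pizza', 'pizza', 'ice cream'],
                    'col2': ['boy', 'boy', 'boy']})
df2 = pd.DataFrame({'col1': ['pizza', 'pizza', 'chicken', 'ice cream'],
                    'col2': ['boy', 'girl', 'girl', 'boy']},
                   index=range(10, 14))

keys = ['col1', 'col2']  # columns that must match together

# A plain inner merge repeats a df2 row once per matching df1 row.
naive = pd.merge(df2.reset_index(), df1, how='inner', on=keys).set_index('index')

# Deduplicating the key pairs first keeps each df2 row at most once.
unique_pairs = df1[keys].drop_duplicates()
filtered = (pd.merge(df2.reset_index(), unique_pairs, how='inner', on=keys)
              .set_index('index'))

print(len(naive))   # one more row than `filtered`, because of the repeated pair
print(filtered)
```

Passing `on=keys` explicitly also guards against the merge silently joining on any other column names the two frames happen to share.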


    Alternatively, you could construct MultiIndexes from the col1 and col2 columns and then call the MultiIndex.isin method:

    index1 = pd.MultiIndex.from_arrays([df1[col] for col in ['col1', 'col2']])
    index2 = pd.MultiIndex.from_arrays([df2[col] for col in ['col1', 'col2']])
    print(df2.loc[index2.isin(index1)])
    

    yields

             col1  col2
    10      pizza   boy
    11      pizza  girl
    16  ice cream   boy
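
The MultiIndex approach can be wrapped in a small helper, sketched below. The function name `pair_mask` and its parameters are illustrative and not part of pandas; it uses `pd.MultiIndex.from_frame`, which assumes a reasonably recent pandas version. Because the result is a boolean mask rather than a merged frame, df2's index is kept without any reset_index/set_index step, and negating the mask gives the rows that have no match.

```python
import pandas as pd

def pair_mask(left, right, cols):
    """Boolean array: True where a row's values in `cols` appear together in some row of `right`."""
    left_keys = pd.MultiIndex.from_frame(left[cols])
    right_keys = pd.MultiIndex.from_frame(right[cols])
    return left_keys.isin(right_keys)

# Same data as in the answer above.
df1 = pd.DataFrame({'col1': ['pizza', 'hamburger', 'hamburger', 'pizza', 'ice cream'],
                    'col2': ['boy', 'boy', 'girl', 'girl', 'boy']},
                   index=range(1, 6))
df2 = pd.DataFrame({'col1': ['pizza', 'pizza', 'chicken', 'cake', 'cake', 'chicken', 'ice cream'],
                    'col2': ['boy', 'girl', 'girl', 'boy', 'girl', 'boy', 'boy']},
                   index=range(10, 17))

mask = pair_mask(df2, df1, ['col1', 'col2'])
matched = df2[mask]      # rows of df2 whose (col1, col2) pair occurs in df1
unmatched = df2[~mask]   # rows of df2 with no matching pair in df1

print(matched)
print(unmatched)
```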
    
