Pandas analogue to SQL MINUS / EXCEPT operator, using multiple columns

离开以前 2020-11-30 15:35

I'm looking for the fastest, most idiomatic pandas analogue of the SQL MINUS (AKA EXCEPT) operator.

Here is what I mean - given two Pandas DataFrames as follows:

I         


        
5 Answers
  •  刺人心
    2020-11-30 15:42

    We can use pandas.concat with drop_duplicates here, passing keep=False so that every row whose ['a', 'b'] values occur more than once is dropped:

    pd.concat([d1, d2]).drop_duplicates(['a', 'b'], keep=False)
    
       a  b  c
    1  0  1  2
    2  1  0  3
    6  2  2  7
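
Since the question's frames are not shown above, here is a minimal runnable sketch of the same call on hypothetical sample data (the values of d1 and d2 below are assumptions, not the original frames). It also shows that a row whose key exists only in d2 survives, so on its own this call behaves like a symmetric difference on the key columns rather than a strict EXCEPT.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration only.
d1 = pd.DataFrame({"a": [0, 0, 1, 2], "b": [0, 1, 1, 2], "c": [1, 2, 3, 4]})
d2 = pd.DataFrame({"a": [0, 1, 3], "b": [0, 1, 3], "c": [10, 11, 12]})

# keep=False drops every row whose (a, b) pair occurs more than once in
# the concatenated frame, i.e. the pairs shared by d1 and d2.
result = pd.concat([d1, d2]).drop_duplicates(subset=["a", "b"], keep=False)

# The (3, 3) row comes from d2 only, so it is still present in result.
print(result)
```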
    

    Edit after comment by OP

    If you want to make sure that rows unique to d2 aren't included in the result, we can duplicate d2 before concatenating, so that each of its rows occurs at least twice and is dropped as well:

    pd.concat([d1, pd.concat([d2]*2)]).drop_duplicates(['a', 'b'], keep=False)
    
       a  b  c
    1  0  1  2
    2  1  0  3
    6  2  2  7
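
The duplication trick can be wrapped in a small function, sketched below under stated assumptions: except_on is a hypothetical helper name, and the sample frames are made up for illustration, not taken from the question.

```python
import pandas as pd

def except_on(left, right, keys):
    """Hypothetical helper: rows of `left` whose `keys` values do not occur in `right`."""
    # Concatenating `right` twice guarantees that each of its key tuples
    # occurs at least twice, so keep=False removes all of them.
    doubled = pd.concat([right, right])
    return pd.concat([left, doubled]).drop_duplicates(subset=keys, keep=False)

# Hypothetical sample data, made up for illustration only.
d1 = pd.DataFrame({"a": [0, 0, 1, 2], "b": [0, 1, 1, 2], "c": [1, 2, 3, 4]})
d2 = pd.DataFrame({"a": [0, 1, 3], "b": [0, 1, 3], "c": [10, 11, 12]})

result = except_on(d1, d2, ["a", "b"])
print(result)
```

One caveat of this approach: because keep=False drops all repeated keys, rows whose key values repeat within the left frame itself are removed too, even if that key never appears in the right frame.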
    
