I'm looking for the fastest, most idiomatic analog of the SQL MINUS (a.k.a. EXCEPT) operator.
Here is what I mean: given two pandas DataFrames as follows:
One possible solution with merge and indicator=True:
    df = (d1.reset_index()
            .merge(d2, on=['a','b'], indicator=True, how='outer', suffixes=('','_'))
            .query('_merge == "left_only"')
            .set_index('index')
            .rename_axis(None)
            .reindex(d1.columns, axis=1))
    print(df)
       a  b  c
    1  0  1  2
    2  1  0  3
    6  2  2  7
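To see what the indicator column holds before the filter is applied, here is a minimal sketch on two tiny frames. The names `left` and `right` and their values are made up for illustration; they are not the frames from the question.

```python
import pandas as pd

# Hypothetical toy frames, not the d1/d2 from the question.
left = pd.DataFrame({"a": [0, 1], "b": [0, 1], "c": [10, 11]})
right = pd.DataFrame({"a": [0, 2], "b": [0, 2]})

# indicator=True adds a categorical "_merge" column whose value is
# "left_only", "right_only" or "both" for every row of the result.
merged = left.merge(right, on=["a", "b"], how="outer", indicator=True)
print(merged)

# Keeping only the "left_only" rows is the MINUS / EXCEPT step.
only_left = merged[merged["_merge"] == "left_only"]
print(only_left)
```

One thing to watch: the outer join also produces `right_only` rows, padded with NaN, so integer columns coming from the left frame can be upcast to float. If that matters, `how='left'` should avoid those rows altogether.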
Solution with isin:
    df = d1[~d1.set_index(["a", "b"]).index.isin(d2.set_index(["a", "b"]).index)]
    print(df)
       a  b  c
    1  0  1  2
    2  1  0  3
    6  2  2  7
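For anyone who wants to run both approaches side by side, here is a self-contained sketch. The sample frames are hypothetical, chosen only so that rows 1, 2 and 6 survive as in the printed output. The merge variant here uses `how='left'` on the deduplicated key columns of `d2`, which is my substitution for the outer merge shown earlier, so the dtypes of `d1` are preserved.

```python
import pandas as pd

# Hypothetical sample data; the frames from the question are not reproduced here.
d1 = pd.DataFrame({"a": [0, 0, 1, 1, 2, 2, 2],
                   "b": [0, 1, 0, 1, 0, 1, 2],
                   "c": [1, 2, 3, 4, 5, 6, 7]})
d2 = pd.DataFrame({"a": [0, 1, 2, 2, 3],
                   "b": [0, 1, 0, 1, 3],
                   "c": [9, 9, 9, 9, 9]})
keys = ["a", "b"]

# Variant 1: left merge with an indicator column. Only the deduplicated
# key columns of d2 are merged, so no row of d1 is repeated.
via_merge = (d1.reset_index()
               .merge(d2[keys].drop_duplicates(), on=keys, how="left", indicator=True)
               .query('_merge == "left_only"')
               .set_index("index")
               .rename_axis(None)
               .reindex(d1.columns, axis=1))

# Variant 2: membership test of d1's (a, b) pairs in d2's (a, b) pairs.
mask = d1.set_index(keys).index.isin(d2.set_index(keys).index)
via_isin = d1[~mask]

print(via_merge)
print(via_isin)
```

Both variants compare rows on the key columns only; column `c` of `d2` plays no part in the result.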