I have two dataframes:
all_data:
AID VID Freq
0 00016A3E 0127C661 1
1 00016A3E 0C05DA5D 2
2 0001
If all_data doesn't contain duplicate rows, here is a way without merge, using drop_duplicates. The keep=False argument drops all duplicated rows, so every row that appears in both frames is removed and only the rows unique to all_data remain:
pd.concat([all_data, subset]).drop_duplicates(keep=False)
Out[23]:
AID VID Freq
0 00016A3E 0127C661 1
2 00016A3E 0C032814 1
3 00016A3E 0BF6C78D 1
4 00016A3E 0A79DFF1 1
5 00016A3E 07BD2FB2 1
6 00016A3E 0790E61B 1
8 00016A3E 073630B5 3
9 00016A3E 06613535 1
10 00016A3E 05F809AF 1
11 00016A3E 05C625FF 1
12 00016A3E 04220EA8 4
13 00016A3E 013A29E5 1
14 00016A3E 0761C98A 1
15 00016AE9 0A769475 16
16 00016AE9 0A7DED0A 2
17 00016AE9 0ABF60DF 9
18 00016AE9 0AE3F25A 2
20 00016AE9 0BD8975A 2
21 00016AE9 44DF880B 1
22 00016AE9 43F9E08E 2
23 00016AE9 44EA5E08 2
24 00016AE9 4539ED1E 16
25 00016AE9 8516B55A 4
26 00016AE9 0972AFF2 1
27 00016AE9 0C559B34 1
28 00016AE9 06B5C040 7
%timeit pd.concat([all_data, subset]).drop_duplicates(keep=False)
1000 loops, best of 3: 1.53 ms per loop
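For reference, a minimal self-contained sketch of the same trick; the toy all_data/subset values below are invented just so the snippet runs on its own:

import pandas as pd

# made-up sample data for illustration only
all_data = pd.DataFrame({
    'AID':  ['00016A3E', '00016A3E', '00016A3E', '00016AE9'],
    'VID':  ['0127C661', '0C05DA5D', '0C032814', '0A769475'],
    'Freq': [1, 2, 1, 16],
})
subset = all_data.iloc[[1]]  # pretend these are the rows to remove

# concatenating makes every subset row appear twice;
# keep=False then drops both copies, leaving only rows unique to all_data
remaining = pd.concat([all_data, subset]).drop_duplicates(keep=False)
print(remaining)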
%%timeit
pd.merge(all_data, subset, how='left', indicator=True) \
.query("_merge == 'left_only'") \
    .drop('_merge', axis=1)
100 loops, best of 3: 8.58 ms per loop
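The merge route is slower here, but it is the safer choice when all_data itself contains duplicated rows, since the concat trick would drop those as well. A minimal sketch of the same anti-join, reusing the toy all_data/subset frames from the snippet above and spelling out the column drop so it also works on newer pandas versions:

# anti-join via the _merge indicator column added by indicator=True
remaining = (
    pd.merge(all_data, subset, how='left', indicator=True)
      .query("_merge == 'left_only'")
      .drop(columns='_merge')
)
print(remaining)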