I have this DataFrame (df1) in Pandas:
import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.random.rand(10, 4), columns=list('ABCD'))
print(df1)
A B C
@Andrew: I believe I found a way to drop the rows of one dataframe that are already present in another (i.e. to answer my EDIT) without using loops - let me know if you disagree and/or if my OP + EDIT did not clearly state this:
THIS WORKS
The columns for both dataframes are always the same - A, B, C and D. With this in mind, based heavily on Andrew's approach, here is how to drop the rows from df2 that are also present in df1:
common_cols = df1.columns.tolist()  # generate list of column names
df12 = pd.merge(df1, df2, on=common_cols, how='inner')  # extract common rows with merge
df2 = df2[~df2['A'].isin(df12['A'])]  # keep df2 rows whose A value is not among the common rows
Line 3 does the following:
It extracts only the rows of df2 that do not match rows in df1, using column A to make this comparison.
NOTE: this method is essentially the equivalent of the SQL NOT IN().
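One caveat: filtering on column A alone relies on the values in A being unique to each row, which holds for random floats but not for data in general. A minimal sketch of a variant that compares all columns at once is below. It swaps in a left merge with indicator=True (an anti-join) in place of the isin filter, so it is not the method above. The sample df2, the seed, and the names rng, new_rows, merged and df2_only are assumptions made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: df2 holds the first 3 rows of df1 plus 5 new rows
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.random((10, 4)), columns=list("ABCD"))
new_rows = pd.DataFrame(rng.random((5, 4)), columns=list("ABCD"))
df2 = pd.concat([df1.iloc[:3], new_rows], ignore_index=True)

common_cols = df1.columns.tolist()

# Left merge on every column; the '_merge' column is 'both' for df2 rows
# that also exist in df1 and 'left_only' for rows found only in df2.
# drop_duplicates() on df1 keeps the merge from repeating df2 rows.
merged = df2.merge(df1.drop_duplicates(), on=common_cols,
                   how="left", indicator=True)

# Keep only the rows that appear in df2 but not in df1
df2_only = merged.loc[merged["_merge"] == "left_only", common_cols]
df2_only = df2_only.reset_index(drop=True)

print(df2_only)
```

Because the match is on all of A, B, C and D, two rows that merely share a value in A are not treated as duplicates.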