I have 2 dataframes df1 and df2 of different size.
df1 = pd.DataFrame({'A':[np.nan, np.nan, np.nan, 'AAA','SSS','DDD'], 'B':[np.nan,np.nan,'ciao'
I believe you need isin with boolean indexing.
To also omit the NaN rows, chain a new condition:
# df2 changed so that column C contains no NaN
df2 = pd.DataFrame({'C':[4, 5, 5, 'SSS','FFF','KKK','AAA'],
'D':[np.nan,np.nan,np.nan,1,np.nan,np.nan,np.nan]})
print (df2)
     C    D
0    4  NaN
1    5  NaN
2    5  NaN
3  SSS  1.0
4  FFF  NaN
5  KKK  NaN
6  AAA  NaN
df = df1[~(df1['A'].isin(df2['C']) | (df1['A'].isnull()))]
print (df)
     A    B
5  DDD  NaN
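A self-contained sketch of this filter, for anyone who wants to run it directly. It rebuilds only column A of df1 from the question (column B is left out, as an assumption, because its definition is cut off in the post) together with the changed df2. The helper name rows_not_in and its drop_missing flag are hypothetical, not part of pandas.

```python
import numpy as np
import pandas as pd

# Column A as given in the question; column B is omitted (assumption)
# because its full definition is not available.
df1 = pd.DataFrame({'A': [np.nan, np.nan, np.nan, 'AAA', 'SSS', 'DDD']})
# df2 changed so that column C contains no NaN
df2 = pd.DataFrame({'C': [4, 5, 5, 'SSS', 'FFF', 'KKK', 'AAA']})


def rows_not_in(left, left_col, values, drop_missing=True):
    """Hypothetical helper: keep rows of `left` whose `left_col` is not in `values`."""
    # True where the value of left_col also appears in `values`
    mask = left[left_col].isin(values)
    if drop_missing:
        # chain a second condition so missing values are removed as well
        mask = mask | left[left_col].isna()
    # invert the mask: keep only rows that matched neither condition
    return left[~mask]


result = rows_not_in(df1, 'A', df2['C'])
print(result)
```

Calling rows_not_in(df1, 'A', df2['C'], drop_missing=False) skips the isna condition, so the NaN rows stay in the result.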
If you do not need to drop the NaN rows when NaN is absent from column C:
df = df1[~df1['A'].isin(df2['C'])]
print (df)
     A     B
0  NaN   NaN
1  NaN   NaN
2  NaN  ciao
5  DDD   NaN
If NaNs exist in both columns, use the second solution (the input DataFrames are from the question):
df = df1[~df1['A'].isin(df2['C'])]
print (df)
     A    B
5  DDD  NaN
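The shorter filter is enough here because isin reports a NaN in A as present when C itself contains NaN. A minimal sketch of that behaviour follows. Both C columns are hypothetical stand-ins (the question's original df2 is not reproduced in this post), and the names a, c_with_nan and c_without_nan are made up for illustration.

```python
import numpy as np
import pandas as pd

# Values of column A from the question; object dtype is set explicitly so the
# result does not depend on how a given pandas version infers string columns.
a = pd.Series([np.nan, np.nan, np.nan, 'AAA', 'SSS', 'DDD'], dtype=object)

# Hypothetical C columns: one containing a NaN, one without any NaN
c_with_nan = pd.Series([np.nan, 5, 5, 'SSS', 'FFF', 'KKK', 'AAA'], dtype=object)
c_without_nan = pd.Series([4, 5, 5, 'SSS', 'FFF', 'KKK', 'AAA'], dtype=object)

# NaN in `a` counts as a match only when the other column has a NaN too
kept_with_nan = a[~a.isin(c_with_nan)]
kept_without_nan = a[~a.isin(c_without_nan)]

print(kept_with_nan.index.tolist())
print(kept_without_nan.index.tolist())
```

When C has no NaN, the NaN rows of A survive the isin filter, which is why the extra isnull condition is needed in that case.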