Find different rows between 2 dataframes of different size with Pandas

栀梦 2020-12-22 03:42

I have 2 dataframes, df1 and df2, of different sizes.

df1 = pd.DataFrame({'A':[np.nan, np.nan, np.nan, 'AAA','SSS','DDD'], 'B':[np.nan,np.nan,'ciao'


        
1 Answer
  • 2020-12-22 04:32

    I believe you need isin with boolean indexing.

    isin alone does not drop rows where A is NaN when C contains no NaN, so chain an isnull condition to omit those rows as well:

    # df2 changed so that column C contains no NaN
    df2 = pd.DataFrame({'C':[4, 5, 5, 'SSS','FFF','KKK','AAA'], 
                        'D':[np.nan,np.nan,np.nan,1,np.nan,np.nan,np.nan]})
    print (df2)
         C    D
    0    4  NaN
    1    5  NaN
    2    5  NaN
    3  SSS  1.0
    4  FFF  NaN
    5  KKK  NaN
    6  AAA  NaN
    
    df = df1[~(df1['A'].isin(df2['C']) | (df1['A'].isnull()))]
    print (df)
         A    B
    5  DDD  NaN
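
The filter above can be reproduced end to end with the following minimal sketch. It assumes only column A of df1, since the values of column B are truncated in the question; the names in_c, is_nan and only_in_df1 are illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

# Column A is taken from the question; column B is left out because its
# values are truncated there.
df1 = pd.DataFrame({'A': [np.nan, np.nan, np.nan, 'AAA', 'SSS', 'DDD']})
# df2 as changed in the answer: no NaN in column C (column D is not needed here).
df2 = pd.DataFrame({'C': [4, 5, 5, 'SSS', 'FFF', 'KKK', 'AAA']})

in_c = df1['A'].isin(df2['C'])   # True where the value of A also appears in C
is_nan = df1['A'].isnull()       # True where A is NaN

# Keep rows whose A value is neither present in C nor NaN.
only_in_df1 = df1[~(in_c | is_nan)]
print(only_in_df1)
```

Splitting the mask into two named Series is only for readability; it is equivalent to the one-line expression in the answer.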
    

    If it is not necessary to omit the NaN rows (and NaN does not exist in column C), isin alone is enough:

    df = df1[~df1['A'].isin(df2['C'])]
    print (df)
         A     B
    0  NaN   NaN
    1  NaN   NaN
    2  NaN  ciao
    5  DDD   NaN
    

    If NaNs exist in both columns, use the second solution, because isin then matches the NaNs in A against the NaNs in C:

    (the input DataFrames are the ones from the question)

    df = df1[~df1['A'].isin(df2['C'])]
    print (df)
         A    B
    5  DDD  NaN
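
The difference between the last two outputs comes down to how isin handles NaN. The sketch below illustrates this on small hypothetical Series (a, c_without_nan and c_with_nan are not from the question); dtype=object is set explicitly on the assumption that the question's columns are mixed-type object columns.

```python
import numpy as np
import pandas as pd

# Hypothetical data, object dtype assumed to mirror mixed-type columns.
a = pd.Series([np.nan, 'AAA', 'DDD'], dtype=object)
c_without_nan = pd.Series(['AAA', 'SSS'], dtype=object)
c_with_nan = pd.Series(['AAA', np.nan], dtype=object)

# NaN in `a` is not found when the lookup values contain no NaN ...
mask_without = a.isin(c_without_nan)
# ... but it is found when the lookup values contain NaN.
mask_with = a.isin(c_with_nan)

print(a[~mask_without].tolist())  # the NaN row survives the filter
print(a[~mask_with].tolist())     # the NaN row is filtered out
```

So when C contains NaN, negating the isin mask already removes the NaN rows of A, and the extra isnull condition is not needed.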
    