Pandas compare two dataframes and remove what matches in one column

情话喂你 2020-12-15 06:37

I have two separate pandas dataframes (df1 and df2) which have multiple columns, but only one in common ('text').

I would like to do fin

3 Answers
  • 2020-12-15 07:06

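    You can left-merge df2 with df1 and keep only the rows where the merge left a NaN, i.e. the rows of df2 whose text has no match in df1 (this assumes the original columns contain no NaN of their own).

    merged = pd.merge(df2, df1, on='text', how='left')
    merged.loc[merged.isnull().any(axis=1), df2.columns]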
    

    or you can use isin:

    df2[~df2.text.isin(df1.text)]
    
  • 2020-12-15 07:13

    As you asked, you can do this efficiently using isin, which avoids the cost of a merge.

    >>> df2[~df2.text.isin(df1.text.values)]
    C   D   text
    0   0.5 2   shot
    1   0.3 2   shot
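
For a reproducible version of the `isin` filter, here is a minimal sketch. The sample frames are hypothetical, since the question's data is not shown in full: the column names `C`, `D` and `text` come from the output above and `A` from another answer, while `B` and all the values are assumptions.

```python
import pandas as pd

# Hypothetical sample data: 'text' is the only column the two frames share.
df1 = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "text": ["dog", "cat", "dog"]})
df2 = pd.DataFrame({"C": [0.1, 0.5, 0.3], "D": [1, 2, 2], "text": ["dog", "shot", "shot"]})

# Boolean mask: True where a df2 text value also occurs somewhere in df1.text.
in_df1 = df2["text"].isin(df1["text"])

# Keep only the df2 rows whose text never appears in df1.
result = df2[~in_df1]
print(result)
```

Because `isin` only tests membership, repeated values in `df1.text` (such as `dog` here) do not multiply rows in the result, and the original index of df2 is preserved.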
    
  • 2020-12-15 07:20

    EDIT:

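    import pandas as pd
    
    # Left merge on the shared 'text' column: df2 rows with no match in df1 get NaN in df1's columns.
    mergeddf = pd.merge(df2, df1, on="text", how="left")
    
    # Keep the unmatched rows (NaN in 'A') and only df2's columns.
    result = mergeddf.loc[mergeddf["A"].isna(), ["C", "D", "text"]]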
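
One caveat with checking `A` for NaN: a row that did match, but whose `A` value was already missing in df1, looks the same as an unmatched row. A possible way around this is the `indicator=True` option of `merge`, which adds a `_merge` column recording where each row came from. The sketch below uses hypothetical data chosen to trigger the problem; it is an illustration, not the original poster's frames.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'cat' exists in df1, but its A value is missing.
df1 = pd.DataFrame({"A": [1.0, np.nan], "text": ["dog", "cat"]})
df2 = pd.DataFrame({"C": [0.1, 0.2, 0.5], "D": [1, 1, 2], "text": ["dog", "cat", "shot"]})

# indicator=True adds a '_merge' column; for a left merge it is 'both' or 'left_only'.
merged = df2.merge(df1, on="text", how="left", indicator=True)

# The NaN test also flags 'cat', which did match; the indicator test does not.
by_nan = merged.loc[merged["A"].isna(), ["C", "D", "text"]]
by_indicator = merged.loc[merged["_merge"] == "left_only", ["C", "D", "text"]]
print(by_nan["text"].tolist())
print(by_indicator["text"].tolist())
```

If df1 is known to have no missing values in `A`, both selections return the same rows.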
    