pandas get rows which are NOT in other dataframe

春和景丽 2020-11-22 02:17

I've two pandas data frames which have some rows in common.

Suppose dataframe2 is a subset of dataframe1.

How can I get the rows of dataframe1 which are not in dataframe2?

13 Answers
  •  萌比男神i
    2020-11-22 03:02

    One method would be to store the result of an inner merge of both dfs; we can then select the rows of df1 whose column values do not appear in this common set:

    In [119]:
    
    common = df1.merge(df2,on=['col1','col2'])
    print(common)
    df1[(~df1.col1.isin(common.col1))&(~df1.col2.isin(common.col2))]
       col1  col2
    0     1    10
    1     2    11
    2     3    12
    Out[119]:
       col1  col2
    3     4    13
    4     5    14
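One caveat with the column-wise check above: it tests each column separately, so a row that shares just one value with the common rows is also dropped. The sketch below illustrates this and shows a row-wise alternative, a left merge with `indicator=True`, which is a different technique from the one in the answer. The sample frames are assumptions: the first five rows are reconstructed from the printed output, and the extra row `(4, 10)` is hypothetical, added only to expose the edge case.

```python
import pandas as pd

# Assumed sample data: the first five rows match the outputs printed above;
# the extra row (4, 10) is hypothetical and only there to show the edge case.
df1 = pd.DataFrame({"col1": [1, 2, 3, 4, 5, 4], "col2": [10, 11, 12, 13, 14, 10]})
df2 = pd.DataFrame({"col1": [1, 2, 3], "col2": [10, 11, 12]})

common = df1.merge(df2, on=["col1", "col2"])

# Column-wise check from the answer: (4, 10) is dropped because 10 appears
# in common.col2, even though the pair (4, 10) is not a row of df2.
column_wise = df1[(~df1.col1.isin(common.col1)) & (~df1.col2.isin(common.col2))]

# Row-wise alternative: a left merge with indicator=True labels each row of
# df1 as "both" or "left_only" in a "_merge" column; keep the "left_only" rows.
merged = df1.merge(df2, on=["col1", "col2"], how="left", indicator=True)
row_wise = merged.loc[merged["_merge"] == "left_only", ["col1", "col2"]]

print(column_wise)
print(row_wise)
```

Note that the merge builds a new frame, so `row_wise` carries the merged frame's positional index, not necessarily the original labels of df1.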
    

    EDIT

    Another method, as you've found, is to use isin, which produces NaN rows that you can drop:

    In [138]:
    
    df1[~df1.isin(df2)].dropna()
    Out[138]:
       col1  col2
    3     4    13
    4     5    14
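To see why this works here, it may help to look at the intermediate mask. When `isin` is given a DataFrame, it compares cell by cell after aligning on index and column labels, so it is not a row-membership test. A minimal sketch, assuming df1 and df2 hold the values implied by the printed outputs:

```python
import pandas as pd

# Assumed sample data, reconstructed from the printed outputs in this answer.
df1 = pd.DataFrame({"col1": [1, 2, 3, 4, 5], "col2": [10, 11, 12, 13, 14]})
df2 = pd.DataFrame({"col1": [1, 2, 3], "col2": [10, 11, 12]})

# True where the cell in df1 equals the cell in df2 at the same index label
# and column label; index labels missing from df2 (3 and 4) give False.
mask = df1.isin(df2)
print(mask)

# Boolean indexing with ~mask replaces the matched cells with NaN, and
# dropna() then removes every row that contains a NaN.
result = df1[~mask].dropna()
print(result)
```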
    

    However, isin aligns the two frames on their index labels, so if the matching rows of df2 do not sit at the same index positions as in df1, this won't work:

    df2 = pd.DataFrame(data={'col1': [2, 3, 4], 'col2': [11, 12, 13]})
    

    will produce the entire df:

    In [140]:
    
    df1[~df1.isin(df2)].dropna()
    Out[140]:
       col1  col2
    0     1    10
    1     2    11
    2     3    12
    3     4    13
    4     5    14
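One way around the alignment problem is to compare whole rows as keys, so that the index labels no longer matter. The sketch below is an alternative to the answer's approach, not part of it: it builds a `MultiIndex` from the columns of each frame and tests membership with `MultiIndex.isin`. The df1 values are assumed from the output above.

```python
import pandas as pd

# df1 is assumed from the printed output; df2 is the shifted frame from the answer.
df1 = pd.DataFrame({"col1": [1, 2, 3, 4, 5], "col2": [10, 11, 12, 13, 14]})
df2 = pd.DataFrame({"col1": [2, 3, 4], "col2": [11, 12, 13]})

# The index-aligned check finds no matching cells, so nothing is dropped.
aligned = df1[~df1.isin(df2)].dropna()

# Treat each row as a (col1, col2) key; membership is then independent of
# where the row sits in either frame.
keys1 = pd.MultiIndex.from_frame(df1[["col1", "col2"]])
keys2 = pd.MultiIndex.from_frame(df2[["col1", "col2"]])
not_in_df2 = df1[~keys1.isin(keys2)]

print(len(aligned))
print(not_in_df2)
```

Because the boolean mask is applied directly to df1, the result keeps the original index labels of df1.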
    
