Remove one dataframe from another with Pandas

挽巷 2021-01-19 23:55

I have two dataframes of different sizes (df1 and df2). I would like to remove from df1 all the rows that also appear in df2.

5 answers
  •  误落风尘
    2021-01-20 00:50

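    pandas has a method called isin, but when it is passed another DataFrame it matches on index and column labels as well as values, so it cannot be applied directly to df1 and df2 here. Instead, we can define a lambda function that builds a single key column from the existing 'A' and 'B' columns of each frame and call isin on those keys. We then negate the result (we want the rows whose keys are not in df2) and reset the index: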

    import pandas as pd
    
    df1 = pd.DataFrame({'A' : ['qwe', 'wer', 'wer', 'rty', 'tyu', 'tyu', 'tyu', 'iop'],
                        'B' : [    5,     6,     6,     9,     7,     7,     7,     1],
                        'C' : ['a'  ,   's',   'd',   'f',   'g',   'h',   'j',   'k']})
    
    df2 = pd.DataFrame({'A' : ['wer', 'tyu'],
                        'B' : [    6,     7]})
    
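    # Build one string key per row from columns 'A' and 'B'
    unique_ind = lambda df: df['A'].astype(str) + '_' + df['B'].astype(str)
    # Keep the df1 rows whose key does not appear among df2's keys
    print(df1[~unique_ind(df1).isin(unique_ind(df2))].reset_index(drop=True))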
    

    printing:

         A  B  C
    0  qwe  5  a
    1  rty  9  f
    2  iop  1  k
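
As an alternative to the string keys above, the same filtering can be sketched as an anti-join using merge with indicator=True. This is a different technique from the one in this answer, not a restatement of it. It assumes the rows are matched on columns 'A' and 'B' only, as in the example data. One reason to consider it: concatenated string keys can collide when the values themselves contain the '_' separator, whereas a merge compares the columns directly.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['qwe', 'wer', 'wer', 'rty', 'tyu', 'tyu', 'tyu', 'iop'],
                    'B': [5, 6, 6, 9, 7, 7, 7, 1],
                    'C': ['a', 's', 'd', 'f', 'g', 'h', 'j', 'k']})
df2 = pd.DataFrame({'A': ['wer', 'tyu'], 'B': [6, 7]})

# Left-join df1 against the distinct key pairs of df2. The _merge column
# records whether each df1 row found a match ('both') or not ('left_only').
# drop_duplicates keeps repeated df2 rows from multiplying df1 rows.
merged = df1.merge(df2[['A', 'B']].drop_duplicates(), on=['A', 'B'],
                   how='left', indicator=True)

# Keep only the rows with no match in df2, then drop the helper column.
result = (merged[merged['_merge'] == 'left_only']
          .drop(columns='_merge')
          .reset_index(drop=True))
print(result)
```

With the example frames, the remaining rows are the same three as in the output above ('qwe', 'rty' and 'iop').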
    
