Check if pandas dataframe is subset of other dataframe

盖世英雄少女心 2021-01-11 11:49

I have two Python Pandas dataframes, A and B, with the same columns (obviously with different data). I want to check whether A is a subset of B, that is, whether all rows of A are contained in B.

3 answers
  •  半阙折子戏
    2021-01-11 12:19

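    In the special case where you do not have any NaN/None values, you can use np.isin (the current name for np.in1d, which is deprecated as of NumPy 2.0) combined with np.stack and np.all:

    import numpy as np
    import pandas as pd

    df1 = pd.DataFrame(np.arange(16).reshape(4, 4))
    df2 = pd.DataFrame(np.arange(40).reshape(10, 4))

    # For each row of df1, test whether each of its values occurs anywhere in df2
    res = np.stack([np.isin(df1.values[i], df2.values) for i in range(df1.shape[0])]).all()
    # True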
    

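    This does not account for duplicates: two identical rows in df1 can both match a single row in df2, although it is not clear whether that matters here. Note also that the check compares individual values rather than whole rows, so a row of df1 whose values are scattered across different rows of df2 still counts as a match.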
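
    If you need a strict row-by-row comparison, one possible alternative is a left merge with indicator=True. This is only a sketch, not part of the approach above: the helper name is_row_subset and the sample frames df_a, df_b and df_c are made up for illustration, and it assumes both frames share the same column names and compatible dtypes.

```python
import pandas as pd


def is_row_subset(a: pd.DataFrame, b: pd.DataFrame) -> bool:
    """Return True if every row of `a` also appears as a row of `b`.

    Hypothetical helper for illustration; assumes `a` and `b` have the
    same column names and compatible dtypes.
    """
    # Drop duplicate rows in b so the left merge cannot multiply rows of a.
    b_unique = b.drop_duplicates()
    # With no `on` argument, merge joins on all columns the frames share.
    # indicator=True adds a `_merge` column saying where each row was found.
    merged = a.merge(b_unique, how="left", indicator=True)
    return bool((merged["_merge"] == "both").all())


# Made-up sample data.
df_a = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
df_b = pd.DataFrame({"x": [1, 2, 5], "y": [3, 4, 6]})
# Every value of df_c occurs somewhere in df_b, but the row (4, 2) does not.
df_c = pd.DataFrame({"x": [1, 4], "y": [3, 2]})

print(is_row_subset(df_a, df_b))  # True
print(is_row_subset(df_c, df_b))  # False
```

    Like the NumPy version, this treats the frames as sets of rows: repeated rows in the first frame still match a single row in the second. If row multiplicity matters, you would need to compare counts as well.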
