Check if pandas dataframe is subset of other dataframe

盖世英雄少女心 2021-01-11 11:49

I have two pandas DataFrames, A and B, with the same columns (but different data). I want to check whether A is a subset of B, that is, whether all rows of A are contained in

3 answers

半阙折子戏 (OP)

2021-01-11 12:19
In the special case where there are no NaN/None values, you can combine np.in1d with np.stack and np.all:
```
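import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(16).reshape(4, 4))
df2 = pd.DataFrame(np.arange(40).reshape(10, 4))

# For each row of df1, test whether each of its values occurs anywhere in df2.
# np.in1d is deprecated as of NumPy 2.0; np.isin is the replacement.
res = np.stack([np.in1d(df1.values[i], df2.values) for i in range(df1.shape[0])]).all()
# True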
```
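Note that this tests each value of a df1 row for membership anywhere in df2, not whether the row occurs in df2 as a whole row. It also does not account for duplicates: two identical rows in df1 may both match a single row in df2. It is not clear from the question whether that is an issue.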
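If whole-row matching is what you need, one possible alternative is a left merge on all columns with `indicator=True`. This is a minimal sketch rather than part of the approach above. The helper name `is_row_subset` and the frame `df3` are made up for illustration, and it assumes both frames have the same column labels and compatible dtypes. Duplicates are dropped first, so multiplicity is deliberately ignored.

```python
import numpy as np
import pandas as pd


def is_row_subset(a: pd.DataFrame, b: pd.DataFrame) -> bool:
    """Return True if every distinct row of `a` occurs as a whole row in `b`.

    Hypothetical helper: assumes `a` and `b` share the same column labels.
    """
    # Drop duplicates so each distinct row is compared once (multiplicity ignored).
    a_rows = a.drop_duplicates()
    b_rows = b.drop_duplicates()
    # Left merge on every column; "_merge" records whether a match was found in b.
    merged = a_rows.merge(b_rows, how="left", on=list(a.columns), indicator=True)
    return bool((merged["_merge"] == "both").all())


df1 = pd.DataFrame(np.arange(16).reshape(4, 4))
df2 = pd.DataFrame(np.arange(40).reshape(10, 4))
# Every value of this row exists somewhere in df2, but not together as one row.
df3 = pd.DataFrame([[3, 2, 1, 0]])

print(is_row_subset(df1, df2))  # True
print(is_row_subset(df3, df2))  # False
```

The `df3` case shows the difference from an element-wise membership test: its values all appear in `df2`, yet no row of `df2` equals `[3, 2, 1, 0]`.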