Pandas populate new dataframe column based on matching columns in another dataframe


花落未央 2020-12-25 14:15

I have a DataFrame, df, that holds my main data: one million rows and 30 columns. Now I want to add another column

5 answers

无人及你 (original poster)

2020-12-25 14:57
APPROACH 1:

You could use concat instead and drop the rows that are duplicated on the combination of the Index and AUTHOR_NAME columns. After that, use isin to keep only the rows whose index value also appears in df:
```
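import pandas as pd

# df2 comes first, so for a duplicated (Index, AUTHOR_NAME) pair the df2 row is the one kept
df_concat = pd.concat([df2, df]).reset_index().drop_duplicates(['Index', 'AUTHOR_NAME'])
df_concat.set_index('Index', inplace=True)
df_concat[df_concat.index.isin(df.index)]  # keep only rows whose Index occurs in df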
```
Note: This assumes that Index is already set as the index of both DataFrames.
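To see what Approach 1 returns, here is a minimal sketch on toy data. Only Index and AUTHOR_NAME come from the answer; the GENRE column and all the values are made up for illustration, and GENRE stands in for the new column to be populated.

```python
import pandas as pd

# Hypothetical toy frames; both use a named index called "Index".
df = pd.DataFrame(
    {"AUTHOR_NAME": ["Ann", "Bob", "Cy"]},
    index=pd.Index([10, 11, 12], name="Index"),
)
df2 = pd.DataFrame(
    {"AUTHOR_NAME": ["Ann", "Cy", "Dee"], "GENRE": ["poetry", "fiction", "essay"]},
    index=pd.Index([10, 12, 13], name="Index"),
)

# Stack df2 on top of df, then drop rows repeated on (Index, AUTHOR_NAME).
# drop_duplicates keeps the first occurrence, i.e. the df2 row.
df_concat = pd.concat([df2, df]).reset_index().drop_duplicates(["Index", "AUTHOR_NAME"])
df_concat = df_concat.set_index("Index")

# Keep only the rows whose Index value occurs in df, sorted for readability.
result = df_concat[df_concat.index.isin(df.index)].sort_index()
print(result)
```

Rows 10 and 12 get their GENRE from df2, while row 11 has no match and ends up with NaN. Because matched rows are taken from df2, any column that exists only in df would come out as NaN for those rows, so this approach seems best suited to the case where df2 carries every column you need.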

APPROACH 2:

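Use join after setting Index and AUTHOR_NAME as the index of both DataFrames (for this, Index must be a regular column rather than already the index), as shown: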
```
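# Make (Index, AUTHOR_NAME) the index of both DataFrames so join can align on it
df2.set_index(['Index', 'AUTHOR_NAME'], inplace=True)
df.set_index(['Index', 'AUTHOR_NAME'], inplace=True)

# Left join: every row of df is kept, and matching rows pick up df2's columns
df.join(df2).reset_index()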
```
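A minimal sketch of Approach 2 on toy data may help. Only Index and AUTHOR_NAME come from the answer; SCORE, GENRE and all the values are hypothetical, with GENRE playing the role of the column to be added. This version uses set_index without inplace so the original frames stay untouched.

```python
import pandas as pd

# Hypothetical toy frames; Index is a regular column here, not the index.
df = pd.DataFrame(
    {"Index": [10, 11, 12], "AUTHOR_NAME": ["Ann", "Bob", "Cy"], "SCORE": [1.0, 2.0, 3.0]}
)
df2 = pd.DataFrame(
    {"Index": [10, 12, 13], "AUTHOR_NAME": ["Ann", "Cy", "Dee"], "GENRE": ["poetry", "fiction", "essay"]}
)

keys = ["Index", "AUTHOR_NAME"]

# join defaults to a left join on the index: all rows of df survive,
# and GENRE is filled in wherever (Index, AUTHOR_NAME) matches a row of df2.
joined = df.set_index(keys).join(df2.set_index(keys)).reset_index()
print(joined)
```

Here the row for Bob has no counterpart in df2, so its GENRE is NaN. If df and df2 share non-key column names, join raises a ValueError unless lsuffix or rsuffix is passed. The same left join can also be written as df.merge(df2, on=keys, how="left"), which skips the set_index step.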