How to update a pyspark dataframe with new values from another dataframe?

迷失自我 2020-12-19 16:04

I have two spark dataframes:

Dataframe A:

|col_1 | col_2 | ... | col_n |
|val_1 | val_2 | ... | val_n |

and dataframe B:

3 Answers

暗喜 (original poster)

2020-12-19 16:49
If you want to keep only unique values and need strictly correct results, a union followed by dropDuplicates should do the trick:
```
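# Columns whose values are the same in old_df and new_df
columns_which_dont_change = [...]
# Stack both DataFrames, then keep one row per combination of those columns
old_df.union(new_df).dropDuplicates(subset=columns_which_dont_change)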
```