How to obtain the symmetric difference between two DataFrames?

借酒劲吻你 2020-12-02 19:18

In the SparkSQL 1.6 API (Scala), DataFrame has functions for intersect and except, but not one for difference. Obviously, a combination of union and

5 answers
  •  小蘑菇 (original poster)
    2020-12-02 19:29

    If you are looking for a PySpark solution, use subtract().

    Also, unionAll is deprecated in 2.0; use union() instead.

    df1.union(df2).subtract(df1.intersect(df2))
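
    To make the logic of that one-liner easier to follow, here is a minimal sketch that models it in plain Python rather than Spark. It assumes rows can be represented as tuples, and the helper name symmetric_difference and the sample rows are made up for illustration. Since subtract() and intersect() return distinct rows, the result is deduplicated even though union() keeps duplicates.

```python
# Plain-Python model of df1.union(df2).subtract(df1.intersect(df2)).
# Assumption: rows are modeled as tuples; this does not call Spark itself.

def symmetric_difference(rows1, rows2):
    """Return the distinct rows that appear in exactly one of the two inputs."""
    union = list(rows1) + list(rows2)   # union() keeps duplicates
    common = set(rows1) & set(rows2)    # intersect() returns distinct common rows
    # subtract() drops every row found in `common` and returns distinct rows
    return {row for row in union if row not in common}

# Hypothetical sample data
rows1 = [(1, "a"), (2, "b"), (2, "b"), (3, "c")]
rows2 = [(3, "c"), (4, "d")]

result = symmetric_difference(rows1, rows2)
print(sorted(result))  # [(1, 'a'), (2, 'b'), (4, 'd')]
```

    Note that the duplicated row (2, "b") appears only once in the output, which matches the set semantics of the DataFrame expression.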
