Removing duplicate columns after a DF join in Spark

小鲜肉 2020-12-24 05:46

When you join two DFs with similar column names:

df = df1.join(df2, df1['id'] == df2['id'])

Join works fine but you can't call the

7 answers
  •  情话喂你
    2020-12-24 06:08

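    In PySpark, you can pass the shared column names to join as a list instead of an equality expression. Each column named in the list then appears only once in the result: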

    df = df1.join(df2, ['each', 'shared', 'col'], how='full')
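
If the shared names are not known in advance, the list could be derived from the two DataFrames' `columns` attributes. The sketch below is only an illustration: `shared_columns` is a hypothetical helper, not part of PySpark, and plain Python lists stand in for `df1.columns` and `df2.columns`. Note that joining on every shared name also makes every one of those columns a join key, which may not be what you want if only `id` should match.

```python
def shared_columns(left_cols, right_cols):
    """Return the names found in both lists, keeping the left-hand order."""
    right = set(right_cols)
    return [name for name in left_cols if name in right]


# Plain lists stand in for df1.columns and df2.columns here.
on = shared_columns(["id", "name", "age"], ["id", "age", "city"])
print(on)  # ['id', 'age']

# Hypothetical usage with two PySpark DataFrames df1 and df2:
# df = df1.join(df2, on, how="full")
```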
    

    Original answer from: How to perform union on two DataFrames with different amounts of columns in spark?
