What is going wrong with `unionAll` of Spark `DataFrame`?

误落风尘 2020-11-29 07:19

Using Spark 1.5.0 and given the following code, I expect `unionAll` to union `DataFrame`s based on their column names. In the code, I'm using some FunSuite for pass

5 Answers
  •  失恋的感觉
    2020-11-29 08:10

    Use unionByName:

    Excerpt from the documentation:

    def unionByName(other: Dataset[T]): Dataset[T]

    The difference between this function and union is that this function resolves columns by name (not by position):

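    val df1 = Seq((1, 2, 3)).toDF("col0", "col1", "col2")
    val df2 = Seq((4, 5, 6)).toDF("col1", "col2", "col0")

    // union resolves columns by position, so df2's values land under the wrong names
    df1.union(df2).show
    // output:
    // +----+----+----+
    // |col0|col1|col2|
    // +----+----+----+
    // |   1|   2|   3|
    // |   4|   5|   6|
    // +----+----+----+

    // unionByName matches df2's columns to df1's by name
    df1.unionByName(df2).show
    // output:
    // +----+----+----+
    // |col0|col1|col2|
    // +----+----+----+
    // |   1|   2|   3|
    // |   6|   4|   5|
    // +----+----+----+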
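
    Note that `unionByName` was added in Spark 2.3.0, so it is not available on the Spark 1.5.0 mentioned in the question. On older versions, one possible workaround is to reorder the second DataFrame's columns to match the first before doing the positional union. This is a minimal sketch, assuming both DataFrames have exactly the same set of column names with compatible types; `unionAllByName` is a hypothetical helper name, not a Spark API:

    ```scala
    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.col

    // Hypothetical helper (not part of Spark): aligns `right` to the column
    // order of `left`, then performs the usual positional union.
    def unionAllByName(left: DataFrame, right: DataFrame): DataFrame = {
      // Assumption: both sides expose the same column names.
      require(
        left.columns.toSet == right.columns.toSet,
        "both DataFrames must have the same set of column names"
      )
      // Select right's columns in left's order so positions line up by name.
      val aligned = right.select(left.columns.map(col): _*)
      left.unionAll(aligned)
    }
    ```

    With the `df1` and `df2` from the excerpt above, `unionAllByName(df1, df2)` should place `df2`'s row under the matching column names instead of by position. On Spark 2.0 and later, `union` can be substituted for `unionAll`.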
    
