How to change a column position in a spark dataframe?

无人共我 2020-12-08 19:33

I was wondering if it is possible to change the position of a column in a DataFrame, i.e. to change the order of the fields in its schema?

Precisely if I have got a dataframe like [f

6 answers
  •  情歌与酒
    2020-12-08 20:04

    The spark-daria library has a reorderColumns method that makes it easy to reorder the columns in a DataFrame.

    import com.github.mrpowers.spark.daria.sql.DataFrameExt._
    
    val actualDF = sourceDF.reorderColumns(
      Seq("field1", "field3", "field2")
    )
    

    The reorderColumns method uses @Rockie Yang's solution under the hood.
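
    For readers who would rather not add a dependency, here is a minimal sketch of the same idea in plain Spark, assuming the solution referred to is the usual select-based approach (select the columns in the order you want). The reorder helper, the local session and the sample data are hypothetical, made up for illustration, and this is not spark-daria's actual source.

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.sql.functions.col

    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("reorder-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical sample data; the column names mirror the example above.
    val sourceDF = Seq((1, "a", true)).toDF("field1", "field2", "field3")

    // Hypothetical helper: reorder by selecting the columns in the wanted order.
    def reorder(df: DataFrame, order: Seq[String]): DataFrame =
      df.select(order.map(col): _*)

    val actualDF = reorder(sourceDF, Seq("field1", "field3", "field2"))
    ```

    Note that with a plain select, any column left out of the sequence is dropped from the result, so the sequence has to list every column you want to keep.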

    If you want to get the column ordering of df1 to equal the column ordering of df2, something like this should work better than hardcoding all the columns:

    df1.reorderColumns(df2.columns)
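
    One place where this matters is union, which matches columns by position rather than by name. A rough sketch in plain Spark, with hypothetical df1 and df2 that hold the same columns in a different order:

    ```scala
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.col

    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("align-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical frames: same columns, different order.
    val df1 = Seq(("a", 1)).toDF("name", "id")
    val df2 = Seq((2, "b")).toDF("id", "name")

    // Reorder df1 to follow df2's column order, then union positionally.
    val aligned = df1.select(df2.columns.map(col): _*)
    val combined = df2.union(aligned)
    ```

    If the only goal is the union itself, unionByName resolves columns by name and avoids the reordering step.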
    

    The spark-daria library also defines a sortColumns transformation to sort columns in ascending or descending order (if you don't want to specify all the columns in a sequence).

    import com.github.mrpowers.spark.daria.sql.transformations._
    
    df.transform(sortColumns("asc"))
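
    Something similar can be sketched without the library by sorting the column names before selecting them. This assumes that "sorting columns" means ordering them alphabetically by name; it is not spark-daria's implementation, and the sample data is hypothetical.

    ```scala
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.col

    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("sort-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical sample data with columns in no particular order.
    val df = Seq((1, "x", 2.0)).toDF("b", "c", "a")

    // Ascending: select the columns by their sorted names.
    val ascDF = df.select(df.columns.sorted.map(col): _*)

    // Descending: same idea with the sorted names reversed.
    val descDF = df.select(df.columns.sorted.reverse.map(col): _*)
    ```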
    
