Spark 2.0.x dump a csv file from a dataframe containing one array of type string

难免孤独 2020-11-29 07:07

I have a DataFrame df that contains one column of type array.

df.show() looks like

|ID|ArrayOfString|Age|Gender|
+--+-------         


        
6 answers
  •  感动是毒
    2020-11-29 07:55

    Here is a method for converting all ArrayType (of any underlying type) columns of a DataFrame to StringType columns:

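    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.{col, concat, concat_ws, lit}

    def stringifyArrays(dataFrame: DataFrame): DataFrame = {
      // Names of all columns whose type is ArrayType, whatever the element type.
      val colsToStringify = dataFrame.schema.filter(p => p.dataType.typeName == "array").map(p => p.name)

      // Cast each array to array<string>, join its elements, and wrap the result in brackets.
      colsToStringify.foldLeft(dataFrame)((df, c) => {
        df.withColumn(c, concat(lit("["), concat_ws(", ", col(c).cast("array<string>")), lit("]")))
      })
    }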
    

    Also, it doesn't use a UDF.
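For context, Spark's CSV writer rejects array columns, which is why the arrays need to be turned into strings before the write. The following is a minimal sketch of how the method above might be used end to end. The sample rows, the object name StringifyArraysExample, the local master setting, and the output path /tmp/stringified_csv are assumptions made for illustration; only the column names come from the question. It also matches array columns by ArrayType rather than by type name, which is an equivalent check and not part of the original answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat, concat_ws, lit}
import org.apache.spark.sql.types.ArrayType

// Hypothetical wrapper object, used only to make the sketch self-contained.
object StringifyArraysExample {

  // Render every ArrayType column as a bracketed, comma-separated string.
  def stringifyArrays(dataFrame: DataFrame): DataFrame = {
    val arrayCols = dataFrame.schema.fields.collect {
      case f if f.dataType.isInstanceOf[ArrayType] => f.name
    }
    arrayCols.foldLeft(dataFrame) { (df, c) =>
      df.withColumn(c, concat(lit("["), concat_ws(", ", col(c).cast("array<string>")), lit("]")))
    }
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation; a real job would get its session from the cluster.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("stringify-arrays")
      .getOrCreate()
    import spark.implicits._

    // Assumed sample data; only the column names are taken from the question.
    val df = Seq(
      (1, Seq("a", "b"), 30, "F"),
      (2, Seq("c"), 41, "M")
    ).toDF("ID", "ArrayOfString", "Age", "Gender")

    val flat = stringifyArrays(df)
    flat.printSchema() // ArrayOfString is now reported as a string column

    // Assumed output path; Spark writes a directory of part files, not a single file.
    flat.write.option("header", "true").mode("overwrite").csv("/tmp/stringified_csv")

    spark.stop()
  }
}
```

With this sample data, the first row's ArrayOfString value becomes the string [a, b]. If a single output file is needed, calling coalesce(1) before the write is one common option, at the cost of funnelling all data through one task.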
