Convert null values to empty array in Spark DataFrame


I have a Spark DataFrame in which one column is an array of integers. The column is nullable because it comes from a left outer join. I want to convert all null values to an empty array.

3 answers
  •  难免孤独
    2020-12-01 11:36

    With a slight modification to zero323's approach, I was able to do this without a UDF in Spark 2.3.1.

    // Row "b" carries a null array, as a left outer join with no match would produce.
    val df = Seq("a" -> Array(1,2,3), "b" -> null, "c" -> Array(7,8,9)).toDF("id","numbers")
    df.show
    +---+---------+
    | id|  numbers|
    +---+---------+
    |  a|[1, 2, 3]|
    |  b|     null|
    |  c|[7, 8, 9]|
    +---+---------+
    
    // coalesce returns its first non-null argument, so null rows fall back to the empty array().
    val df2 = df.withColumn("numbers", coalesce($"numbers", array()))
    df2.show
    +---+---------+
    | id|  numbers|
    +---+---------+
    |  a|[1, 2, 3]|
    |  b|       []|
    |  c|[7, 8, 9]|
    +---+---------+
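
The element type of a bare `array()` is not stated explicitly and has differed between Spark versions, so it may be worth pinning the fallback to an integer array. Below is a minimal sketch of one way to do that, assuming Spark 2.2 or later (where `typedLit` is available). The object name `NullToEmptyArray` and the helper `fillNullIntArrays` are hypothetical names used only for illustration, and the local `SparkSession` setup is an assumption about how the code is run.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, typedLit}

object NullToEmptyArray {
  // Hypothetical helper: replaces nulls in an array<int> column
  // with an empty array whose element type is explicitly Int.
  def fillNullIntArrays(df: DataFrame, column: String): DataFrame =
    df.withColumn(column, coalesce(col(column), typedLit(Seq.empty[Int])))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session; inside spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("null-to-empty-array")
      .getOrCreate()
    import spark.implicits._

    // Same sample data as above, with the tuple type written out explicitly.
    val df = Seq[(String, Array[Int])](
      ("a", Array(1, 2, 3)),
      ("b", null),
      ("c", Array(7, 8, 9))
    ).toDF("id", "numbers")

    val df2 = fillNullIntArrays(df, "numbers")
    df2.printSchema() // lets you confirm the column's element type
    df2.show()

    spark.stop()
  }
}
```

An alternative with the same intent is to keep `array()` and cast it, for example `array().cast("array<int>")`; which reads better is largely a matter of taste.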
    
