Spark StructType for coalesce

Submitted by 房东的猫 on 2020-01-14 14:40:51

Question


I use Spark 2.0.1 with Scala 2.11.

How to provide a default value using coalesce for a column that's a StructType?

Say I have:

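import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{IntegerType, StructType}

// Nested struct type with two integer fields
val ss = new StructType().add("x", IntegerType).add("y", IntegerType)

// Top-level schema: integer column "a" and struct column "b"
val s = new StructType()
    .add("a", IntegerType)
    .add("b", ss)

// Sample rows; the last one has a null struct
val d = Seq(Row(1, Row(1, 2)), Row(2, Row(2, 3)), Row(2, null))

val rd = sc.parallelize(d)
val df = spark.createDataFrame(rd, s)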

Now, df.select($"b").show results in

+-----+
| b   |
+-----+
|[1,2]|
|[2,3]|
| null|
+-----+

My question is: how can I use coalesce to provide a default value (say [0,0]) for the null row?


Answer 1:


You can use the struct function, passing two lit(0) values aliased to match the field names of the struct you already have:

import org.apache.spark.sql.functions.{coalesce, lit, struct}

df.select(coalesce($"b", struct(lit(0).as("x"), lit(0).as("y"))))
  .show()

// +---------------------------------------+
// |coalesce(b, struct(0 AS `x`, 0 AS `y`))|
// +---------------------------------------+
// |                                  [1,2]|
// |                                  [2,3]|
// |                                  [0,0]|
// +---------------------------------------+
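
The select above yields a single column with an auto-generated name. If the goal is to keep column a and retain the name b, one possible variation is to replace the column in place with withColumn. The following is a minimal sketch under assumptions: it runs against a local SparkSession rather than the spark-shell, and the names CoalesceStructDefault, sampleData and withDefaultB are hypothetical helpers introduced only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, lit, struct}
import org.apache.spark.sql.types.{IntegerType, StructType}

// Hypothetical wrapper object; not part of the original question or answer.
object CoalesceStructDefault {

  // Rebuilds the sample DataFrame from the question.
  def sampleData(spark: SparkSession): DataFrame = {
    val ss = new StructType().add("x", IntegerType).add("y", IntegerType)
    val s = new StructType().add("a", IntegerType).add("b", ss)
    val rows = Seq(Row(1, Row(1, 2)), Row(2, Row(2, 3)), Row(2, null))
    spark.createDataFrame(spark.sparkContext.parallelize(rows), s)
  }

  // Replaces null values of struct column "b" with a struct of zeros.
  // The aliases must match the existing field names ("x" and "y").
  def withDefaultB(df: DataFrame): DataFrame = {
    val defaultB = struct(lit(0).as("x"), lit(0).as("y"))
    // withColumn with an existing name overwrites that column,
    // so the result still has exactly the columns "a" and "b".
    df.withColumn("b", coalesce(col("b"), defaultB))
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master is acceptable for trying this out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("coalesce-struct-default")
      .getOrCreate()

    withDefaultB(sampleData(spark)).show()
    spark.stop()
  }
}
```

In the spark-shell, where spark and df already exist, the body of withDefaultB can be applied to df directly.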


Source: https://stackoverflow.com/questions/44377095/spark-structtype-for-coalesce
