Error while exploding a struct column in Spark

Posted by 独自空忆成欢 on 2019-12-04 03:26:47

Question


I have a dataframe whose schema looks like this:

event: struct (nullable = true)
|    | event_category: string (nullable = true)
|    | event_name: string (nullable = true)
|    | properties: struct (nullable = true)
|    |    | ErrorCode: string (nullable = true)
|    |    | ErrorDescription: string (nullable = true)

I am trying to explode the struct column properties using the following code:

df_json.withColumn("event_properties", explode($"event.properties"))

But it is throwing the following exception:

cannot resolve 'explode(`event`.`properties`)' due to data type mismatch: 
input to function explode should be array or map type, 
not StructType(StructField(IDFA,StringType,true),

How can I explode the properties column?


Answer 1:


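The explode function only works on array or map columns, so you need to convert the properties struct to an array first and then apply explode, as below. This works here because every field of the struct is a string; array requires all of its elements to share one type.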

import org.apache.spark.sql.functions._
df_json.withColumn("event_properties", explode(array($"event.properties.*"))).show(false)

This should give you the desired result.
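
For readers who want to try this end to end, here is a minimal sketch of the same approach. The case classes, the sample values, and the object and method names (ExplodeStructExample, explodeProperties) are assumptions made up for illustration; only the schema shape and the explode(array(...)) call come from the question and the answer above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array, col, explode}

// Hypothetical case classes that mirror the schema from the question.
case class Properties(ErrorCode: String, ErrorDescription: String)
case class Event(event_category: String, event_name: String, properties: Properties)
case class Record(event: Event)

object ExplodeStructExample {
  // Wrap the struct fields in an array, then explode that array.
  // array(...) needs all elements to share one type; both fields are strings here.
  def explodeProperties(df: DataFrame): DataFrame =
    df.withColumn("event_properties", explode(array(col("event.properties.*"))))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explode-struct")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample row, for illustration only.
    val df_json = Seq(
      Record(Event("error", "crash", Properties("E42", "Something failed")))
    ).toDF()

    // Each input row yields one output row per struct field value.
    explodeProperties(df_json).show(false)
    spark.stop()
  }
}
```

Note that the exploded column holds only the values, so the field names (ErrorCode, ErrorDescription) are no longer attached to them.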




Answer 2:


As the error message says, you can only explode array or map columns, not struct columns.

You can simply do:

df_json.withColumn("event_properties", $"event.properties")

This generates a new column, event_properties, which is also of struct type.

If you want to turn every field of the struct into its own column, you cannot use withColumn. Instead, do a select with the wildcard *:

df_json.select($"event.properties.*")
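
As a rough sketch of how the wildcard select can be combined with the other fields of event, the example below flattens the struct into top-level columns. The case classes, the sample values, and the names FlattenStructExample and flattenProperties are hypothetical; the schema shape is the one from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical case classes that mirror the schema from the question.
case class Properties(ErrorCode: String, ErrorDescription: String)
case class Event(event_category: String, event_name: String, properties: Properties)
case class Record(event: Event)

object FlattenStructExample {
  // Keep the scalar fields of event and expand properties into one column per field.
  def flattenProperties(df: DataFrame): DataFrame =
    df.select(
      col("event.event_category"),
      col("event.event_name"),
      col("event.properties.*")
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatten-struct")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample row, for illustration only.
    val df_json = Seq(
      Record(Event("error", "crash", Properties("E42", "Something failed")))
    ).toDF()

    val flat = flattenProperties(df_json)
    flat.printSchema() // four top-level string columns
    flat.show(false)
    spark.stop()
  }
}
```

Unlike the array approach, this keeps the field names as column names and does not require the fields to share a type.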



Answer 3:


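You can use the following to flatten the struct. As the error message states, the explode function does not work on a struct. Note that the Dataset.explode method used here has been deprecated since Spark 2.0 in favor of flatMap or select with functions.explode.

import org.apache.spark.sql.Row

// Event is a case class you define for the flattened output
case class Event(errorCode: String, errorDescription: String)

// parquetDF is a DataFrame with the schema from the question;
// this assumes event and properties are never null
val explodeDF = parquetDF.explode($"event") {
  // The function receives a Row holding the selected column; event is itself a Row
  case Row(event: Row) =>
    val properties = event.getAs[Row]("properties")
    Seq(Event(properties.getAs[String]("ErrorCode"), properties.getAs[String]("ErrorDescription")))
}.cache()
// display is a Databricks notebook helper; elsewhere use explodeDF.show(false)
display(explodeDF)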


Source: https://stackoverflow.com/questions/48315442/error-while-exploding-a-struct-column-in-spark
