How to use Spark SQL to parse a JSON array of objects

Backend · Unresolved · 2 answers · 1203 views
慢半拍i 2021-02-10 02:15

I have JSON data in the following form:

{\"Id\":11,\"data\":[{\"package\":\"com.browser1\",\"activetime\":60000},{\"package\":\"com.browser6\",\"activetime\":1205000},{\"pa         


        
2 Answers
  •  轮回少年
    2021-02-10 02:20

    From the JSON data you have given, you can inspect the schema of your DataFrame with printSchema and work from there:

    appActiveTime.printSchema()
    root
     |-- data: array (nullable = true)
     |    |-- element: struct (containsNull = true)
     |    |    |-- activetime: long (nullable = true)
     |    |    |-- package: string (nullable = true)
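
    The schema above assumes the JSON has already been loaded into a DataFrame named appActiveTime. As a minimal sketch, assuming the records sit in a JSON file (the path and app name below are hypothetical), it could be created like this:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("parse-json-array")      // hypothetical app name
      .getOrCreate()

    // Hypothetical path; each line of the file holds one JSON object
    // shaped like the record in the question.
    // Note: the resulting DataFrame will also include the top-level Id column.
    val appActiveTime = spark.read.json("/path/to/app_active_time.json")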
    

    Since data is an array, you need to explode it and then select the struct fields, as below:

    import org.apache.spark.sql.functions._
    import spark.implicits._   // needed for the $"..." column syntax

    // explode turns each element of the data array into its own row,
    // then data.* flattens the struct into top-level columns
    appActiveTime.withColumn("data", explode($"data"))
                 .select("data.*")
                 .show(false)
    

    Output:

    +----------+------------+
    |activetime|     package|
    +----------+------------+
    |     60000|com.browser1|
    |   1205000|com.browser6|
    |   1205000|com.browser7|
    |     60000|com.browser1|
    |   1205000|com.browser6|
    +----------+------------+
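
    Since the question title mentions Spark SQL, the same flattening can also be written as a SQL query after registering a temporary view. This is only a sketch: the view name is made up, and only the columns from the schema shown above are selected.

    // Register the DataFrame under a hypothetical view name
    appActiveTime.createOrReplaceTempView("app_active_time")

    // LATERAL VIEW explode() yields one row per element of the data array;
    // d is the exploded struct, so d.activetime and d.package are its fields.
    spark.sql("""
      SELECT d.activetime, d.package
      FROM app_active_time
      LATERAL VIEW explode(data) t AS d
    """).show(false)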
    

    Hope this helps!
