How to extract values from a JSON string?

生来不讨喜 2021-01-03 05:49

I have a file with a bunch of columns. One column, called jsonstring, is of string type and contains JSON strings. Let's say the format is the following

2 answers
  •  我在风中等你
    2021-01-03 06:37

    You can use withColumn + udf + json4s:

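    import org.json4s.{DefaultFormats, Formats}
    import org.json4s.jackson.JsonMethods._
    import org.apache.spark.sql.functions._
    
    // Parse one JSON string and pull out a top-level value and a nested value.
    // Note: extract[String] throws a MappingException if a key is missing.
    def getJsonContent(jsonstring: String): (String, String) = {
        implicit val formats: Formats = DefaultFormats
        val parsedJson = parse(jsonstring)
        val value1 = (parsedJson \ "key1").extract[String]
        val level2value1 = (parsedJson \ "key2" \ "level2key1").extract[String]
        (value1, level2value1)
    }
    // Wrap the function as a Spark UDF so it can be applied to a column.
    val getJsonContentUDF = udf((jsonstring: String) => getJsonContent(jsonstring))
    
    // The returned tuple becomes a struct column with fields _1 and _2.
    df.withColumn("parsedJson", getJsonContentUDF(df("jsonstring")))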
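
To make the result easier to use, here is a minimal end-to-end sketch of the same idea, which also splits the struct into two ordinary columns. The sample JSON row, the local SparkSession, the object name JsonColumnSketch and the output column names are assumptions for illustration only; the real format from the question is not shown, so adjust the keys to your data.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}
import org.json4s.{DefaultFormats, Formats}
import org.json4s.jackson.JsonMethods.parse

object JsonColumnSketch {
  // Same extraction as in the answer; the keys "key1" and "key2.level2key1"
  // are taken from the answer, not from the (unseen) real data.
  def getJsonContent(jsonstring: String): (String, String) = {
    implicit val formats: Formats = DefaultFormats
    val parsedJson = parse(jsonstring)
    val value1 = (parsedJson \ "key1").extract[String]
    val level2value1 = (parsedJson \ "key2" \ "level2key1").extract[String]
    (value1, level2value1)
  }

  def main(args: Array[String]): Unit = {
    // Local session, assumed here only so the sketch runs on its own.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("json-column-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample row standing in for the real file.
    val df = Seq("""{"key1": "a", "key2": {"level2key1": "b"}}""").toDF("jsonstring")

    val getJsonContentUDF = udf((jsonstring: String) => getJsonContent(jsonstring))

    // The tuple comes back as a struct; pull its fields _1 and _2 out
    // into named columns.
    val result = df
      .withColumn("parsedJson", getJsonContentUDF(col("jsonstring")))
      .select(
        col("jsonstring"),
        col("parsedJson._1").as("key1"),
        col("parsedJson._2").as("level2key1"))

    result.show(false)
    spark.stop()
  }
}
```

If the JSON is simple, Spark's built-in get_json_object(col("jsonstring"), "$.key1") may be enough and avoids the UDF altogether.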
    
