How to query JSON data column using Spark DataFrames?

梦毁少年i 2020-11-22 01:50

I have a Cassandra table that for simplicity looks something like:

key: text
jsonData: text
blobData: blob

I can create a basic data frame

5 Answers
  •  春和景丽
    2020-11-22 02:24

    The underlying JSON string is:

    "{ \"column_name1\":\"value1\",\"column_name2\":\"value2\",\"column_name3\":\"value3\",\"column_name5\":\"value5\"}";
    

    Below is a script that selects the required fields from the JSON and loads them into Cassandra.

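      import org.apache.spark.sql.SaveMode

      // rdd is an RDD[String] with one JSON document per element.
      // List the JSON field names to keep in select(...).
      sqlContext.read.json(rdd).select("column_name1", "column_name2", "column_name3")
                .write.format("org.apache.spark.sql.cassandra")
                .options(Map("table" -> "Table_name", "keyspace" -> "Key_Space_name"))
                .mode(SaveMode.Append)
                .save()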
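
    The script above does not show where `rdd` comes from. The following is a minimal sketch of two ways to get at fields inside a JSON text column, assuming a Spark version where both `SQLContext` and `read.json(RDD[String])` are still available (they are deprecated in newer releases). The in-memory `df`, the object name `JsonColumnSketch`, and the local master setting are stand-ins for illustration only; in the question's setup `df` would be loaded from the Cassandra table instead.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext
    import org.apache.spark.sql.functions.get_json_object

    object JsonColumnSketch {
      def main(args: Array[String]): Unit = {
        // Local context for illustration; a real job would use the cluster's settings.
        val conf = new SparkConf().setAppName("json-column-sketch").setMaster("local[*]")
        val sc = new SparkContext(conf)
        val sqlContext = new SQLContext(sc)
        import sqlContext.implicits._

        // Hypothetical stand-in for the Cassandra table: a key plus a JSON text column.
        val df = Seq(
          ("k1", """{"column_name1":"value1","column_name2":"value2"}""")
        ).toDF("key", "jsonData")

        // Option 1: extract individual fields from the JSON text with a JSON path.
        df.select(
          $"key",
          get_json_object($"jsonData", "$.column_name1").alias("column_name1")
        ).show()

        // Option 2: build an RDD[String] of JSON documents and let Spark infer the schema,
        // which yields the kind of rdd the script above passes to read.json.
        val rdd = df.select("jsonData").rdd.map(_.getString(0))
        sqlContext.read.json(rdd).select("column_name1", "column_name2").show()

        sc.stop()
      }
    }
    ```

    Option 1 keeps the other columns (such as `key`) alongside the extracted values, while Option 2 produces a new DataFrame that contains only the JSON fields.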
    
