Dataframe from List in Java

盖世英雄少女心 2021-01-15 17:23
  • Spark Version: 1.6.2
  • Java Version: 7

I have a List of data, something like:

[[dev, engg, 10000], [kar         


        
3 Answers
  •  长情又很酷
    2021-01-15 17:39

    The task can be completed without JSON. In Scala:

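    // Assumes an existing SparkContext (sparkContext) and SQLContext (sqlContext)
    val data = List("dev, engg, 10000", "karthik, engg, 20000")
    val initialRdd = sparkContext.parallelize(data)
    // Split each line on commas and trim the space left after each comma
    val splitRdd = initialRdd.map { current =>
      val array = current.split(",").map(_.trim)
      (array(0), array(1), array(2))
    }
    import sqlContext.implicits._  // brings toDF into scope
    val dataframe = splitRdd.toDF("name", "degree", "salary")
    dataframe.show()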
    

    The output is:

    +-------+------+------+
    |   name|degree|salary|
    +-------+------+------+
    |    dev|  engg| 10000|
    |karthik|  engg| 20000|
    +-------+------+------+
    

    Note: (array(0), array(1), array(2)) is a Scala tuple (a Tuple3), so toDF maps its three elements to the three named columns.
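
To see what the map step produces without a Spark cluster, the same split logic can be tried on a plain Scala List. This is only a sketch: the object name `SplitToTuple` and the helper `parse` are hypothetical, and the `trim` call is an assumption that the space after each comma is not wanted in the values.

```scala
object SplitToTuple {
  // Split one comma-separated line into a 3-element tuple.
  // trim removes the space that follows each comma in the sample data.
  def parse(line: String): (String, String, String) = {
    val fields = line.split(",").map(_.trim)
    (fields(0), fields(1), fields(2))
  }

  def main(args: Array[String]): Unit = {
    val data = List("dev, engg, 10000", "karthik, engg, 20000")
    // Each element becomes a Tuple3 of strings.
    data.map(parse).foreach(println)
  }
}
```

Without the `trim`, the second and third fields would keep their leading space (for example `" engg"`), which is easy to miss because `show()` right-aligns the columns.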
