How to convert an RDD object to a DataFrame in Spark

慢半拍i 2020-11-22 14:59

How can I convert an RDD (org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]) to a DataFrame (org.apache.spark.sql.DataFrame)? I converted a datafram

11 answers
  •  孤独总比滥情好
    2020-11-22 15:30

    Here is a simple example of converting a List into a Spark RDD and then converting that RDD into a DataFrame.

    Note that I ran the following code in the spark-shell Scala REPL. Here sc is an instance of SparkContext, which is implicitly available in spark-shell; the toDF method likewise relies on implicit conversions that spark-shell imports automatically. I hope this answers your question.

    scala> val numList = List(1,2,3,4,5)
    numList: List[Int] = List(1, 2, 3, 4, 5)
    
    scala> val numRDD = sc.parallelize(numList)
    numRDD: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[80] at parallelize at <console>:28
    
    scala> val numDF = numRDD.toDF
    numDF: org.apache.spark.sql.DataFrame = [_1: int]
    
    scala> numDF.show
    +---+
    | _1|
    +---+
    |  1|
    |  2|
    |  3|
    |  4|
    |  5|
    +---+
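
The transcript above converts an RDD[Int], while the question asks about an RDD[Row]. A Row carries no compile-time type information, so toDF is not available for it; one common approach is to pass an explicit schema to createDataFrame. The following is a minimal sketch, assuming Spark 2.0 or later (for SparkSession). The object name RowRddToDataFrame, the helper toDataFrame, the column names id and name, and the sample rows are all hypothetical and chosen only for illustration.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object RowRddToDataFrame {
  // Hypothetical schema: it must match the order and types of the values in each Row.
  val schema: StructType = StructType(Seq(
    StructField("id", IntegerType, nullable = false),
    StructField("name", StringType, nullable = true)
  ))

  // Attach the explicit schema to an RDD[Row] to obtain a DataFrame.
  def toDataFrame(spark: SparkSession, rowRDD: RDD[Row]): DataFrame =
    spark.createDataFrame(rowRDD, schema)

  def main(args: Array[String]): Unit = {
    // Local session for illustration; in spark-shell a session named `spark` already exists.
    val spark = SparkSession.builder()
      .appName("RowRddToDataFrame")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical sample data standing in for the asker's RDD[Row].
    val rowRDD: RDD[Row] =
      spark.sparkContext.parallelize(Seq(Row(1, "a"), Row(2, "b")))

    val df = toDataFrame(spark, rowRDD)
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

If the RDD was obtained from an existing DataFrame via df.rdd, that DataFrame's own df.schema can be passed to createDataFrame instead of building a StructType by hand.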
    
