How to convert a case-class-based RDD into a DataFrame?

Submitted by 雨燕双飞 on 2019-12-10 02:08:23

Question


The Spark documentation shows how to create a DataFrame from an RDD, using Scala case classes to infer a schema. I am trying to reproduce this concept using sqlContext.createDataFrame(RDD, CaseClass), but my DataFrame ends up empty. Here's my Scala code:

// sc is the SparkContext, while sqlContext is the SQLContext.

// Define the case class and raw data
case class Dog(name: String)
val data = Array(
    Dog("Rex"),
    Dog("Fido")
)

// Create an RDD from the raw data
val dogRDD = sc.parallelize(data)

// Print the RDD for debugging (this works, shows 2 dogs)
dogRDD.collect().foreach(println)

// Create a DataFrame from the RDD
val dogDF = sqlContext.createDataFrame(dogRDD, classOf[Dog])

// Print the DataFrame for debugging (this fails, shows 0 dogs)
dogDF.show()

The output I'm seeing is:

Dog(Rex)
Dog(Fido)
++
||
++
||
||
++

What am I missing?

Thanks!


Answer 1:


All you need is:

val dogDF = sqlContext.createDataFrame(dogRDD)

The second parameter is part of the Java API and expects your class to follow the JavaBeans convention (getters/setters). Your case class doesn't follow this convention, so no properties are detected, which leads to an empty DataFrame with no columns.
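To see why no columns are detected, one can ask the JDK's own bean introspector which properties it finds on the case class. This is only a sketch, not Spark's code: it assumes the Java API discovers columns through JavaBeans-style getters, as described above, and the names BeanDog and BeanCheck.beanProps are hypothetical, introduced purely for illustration.

```scala
import java.beans.Introspector
import scala.beans.BeanProperty

// The case class from the question: its accessor is name(), not getName().
case class Dog(name: String)

// Hypothetical bean-style counterpart: @BeanProperty generates getName/setName.
class BeanDog(@BeanProperty var name: String)

object BeanCheck {
  // Names of the JavaBeans properties the JDK introspector finds,
  // ignoring the "class" pseudo-property that comes from getClass().
  def beanProps(c: Class[_]): List[String] =
    Introspector.getBeanInfo(c).getPropertyDescriptors
      .map(_.getName)
      .filterNot(_ == "class")
      .toList
}
```

BeanCheck.beanProps(classOf[Dog]) returns an empty list, while BeanCheck.beanProps(classOf[BeanDog]) returns List("name"). A class with no bean properties gives the two-argument createDataFrame nothing to turn into columns, which matches the empty output in the question.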




Answer 2:


You can create a DataFrame directly from a Seq of case class instances using toDF. Outside the Spark shell, the implicit conversions must be imported first:

import sqlContext.implicits._
val dogDf = Seq(Dog("Rex"), Dog("Fido")).toDF



Answer 3:


The case class approach can fail in cluster mode: it throws a ClassNotFoundException for the case class you defined.

Instead, convert the data to an RDD[Row], define the schema with StructType and StructField, and then call createDataFrame, like this:

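import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// data is assumed here to be an RDD of two-element string sequences
val rdd = data.map { attrs => Row(attrs(0), attrs(1)) }

val rddStruct = new StructType(Array(StructField("id", StringType, nullable = true), StructField("pos", StringType, nullable = true)))

sqlContext.createDataFrame(rdd, rddStruct)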

toDF() won't work either.



Source: https://stackoverflow.com/questions/37004352/how-to-convert-a-case-class-based-rdd-into-a-dataframe
