How to convert a case-class-based RDD into a DataFrame?

青春壹個敷衍的年華 提交于 2019-12-05 01:53:23

All you need is just

val dogDF = sqlContext.createDataFrame(dogRDD)

Second parameter is part of Java API and expects you class follows java beans convention (getters/setters). Your case class doesn't follow this convention, so no property is detected, that leads to empty DataFrame with no columns.

David Griffin

You can create a DataFrame directly from a Seq of case class instances using toDF as follows:

val dogDf = Seq(Dog("Rex"), Dog("Fido")).toDF
Kamaldeep Singh

Case Class Approach won't Work in cluster mode. It'll give ClassNotFoundException to the case class you defined.

Convert it a RDD[Row] and define the schema of your RDD with StructField and then createDataFrame like

val rdd = data.map { attrs => Row(attrs(0),attrs(1)) }  

val rddStruct = new StructType(Array(StructField("id", StringType, nullable = true),StructField("pos", StringType, nullable = true)))

sqlContext.createDataFrame(rdd,rddStruct)

toDF() wont work either

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!