Spark RDD to DataFrame (Python)
Question: I am trying to convert a Spark RDD to a DataFrame. I have seen the documentation and examples where the schema is passed to the sqlContext.createDataFrame(rdd, schema) function. But I have 38 columns (fields), and that number will increase further. Specifying the schema manually, field by field, would be very tedious. Is there any other way to specify the schema without knowing the column information in advance?

Answer 1: There are two ways to convert an RDD to a DataFrame in Spark: toDF() and createDataFrame(rdd, schema). I will