Reading DataFrame from partitioned parquet file

Backend · unresolved · 3 answers · 1093 views
离开以前 2020-12-04 18:10

How do I read a partitioned parquet file with a condition into a DataFrame?

This works fine:

val dataframe = sqlContext.read.parquet("file:///home/msoproj/dev_data
3 Answers
  •  天命终不由人
    2020-12-04 18:46

    You need to provide the mergeSchema = true option, as shown below (this is from Spark 1.6.0):

    val dataframe = sqlContext.read.option("mergeSchema", "true").parquet("file:///your/path/data=jDD")
    

    This will read all the parquet files into the DataFrame and also create the partition columns year, month, and day in it.

    Ref: https://spark.apache.org/docs/1.6.0/sql-programming-guide.html#schema-merging
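    The year, month, and day columns appear because Spark's partition discovery parses directory names of the form key=value (e.g. .../year=2020/month=12/day=04/part-00000.parquet). As a rough illustration of that convention in plain Scala (no Spark required; partitionColumns is a hypothetical helper written for this sketch, not a Spark API):

    ```scala
    // Extract partition columns from a file path by collecting every
    // key=value path segment, mimicking Spark's partition discovery.
    def partitionColumns(path: String): Map[String, String] =
      path.split("/")
        .filter(_.contains("="))          // keep only key=value segments
        .map { seg =>
          val Array(k, v) = seg.split("=", 2)  // split on the first '=' only
          k -> v
        }
        .toMap

    // Example: yields Map(year -> 2020, month -> 12, day -> 04)
    partitionColumns("/data/year=2020/month=12/day=04/part-00000.parquet")
    ```

    Spark does the same walk over the directory tree under the path you pass to parquet(), which is why filtering on those derived columns (e.g. .filter("year = 2020")) prunes whole directories instead of scanning every file.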
