Save a Spark DataFrame to Hive: table not readable because “parquet not a SequenceFile”

走了就别回头了 2020-12-28 22:08

I'd like to save data in a Spark (v 1.3.0) DataFrame to a Hive table using PySpark.

The documentation states:

\"spark.sql.hive.convertMetasto

4 Answers
  •  抹茶落季
    2020-12-28 22:21

    I did this in PySpark, Spark version 2.3.0:

    First, create an empty table with the same layout as the table whose data you want to save/overwrite:

    create table databaseName.NewTableName like databaseName.OldTableName;
    

    Then run the command below:

    df1.write.mode("overwrite").partitionBy("year", "month", "day").format("parquet").saveAsTable("databaseName.NewTableName")
    

    The catch is that you can't read this table with Hive, but you can read it with Spark (see the sketch below for a Hive-readable alternative).
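
    You can confirm the Spark side by reading the table back:

        spark.table("databaseName.NewTableName").show()

    If the table must also stay readable from Hive, one commonly suggested variant (a sketch under assumptions, not part of this answer's tested steps) is to insert into the pre-created table instead of replacing it with saveAsTable, so the SerDe metadata set up by CREATE TABLE ... LIKE is preserved:

        # Sketch: assumes the empty table created above already exists with the
        # desired partitioning. insertInto() writes into the existing table
        # definition (columns matched by position, partition columns last)
        # instead of rewriting its metadata, so Hive should still read it.
        spark.conf.set("hive.exec.dynamic.partition", "true")
        spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")
        df1.write.insertInto("databaseName.NewTableName", overwrite=True)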
