Save a Spark DataFrame to Hive: table not readable because “parquet not a SequenceFile”

走了就别回头了 2020-12-28 22:08

I'd like to save data in a Spark (v 1.3.0) DataFrame to a Hive table using PySpark.

The documentation states:

\"spark.sql.hive.convertMetasto

4 Answers
  •  抹茶落季
    2020-12-28 22:21

    I did this in PySpark, Spark version 2.3.0:

    First, create an empty table with the same layout as the table whose data you want to save/overwrite:

    create table databaseName.NewTableName like databaseName.OldTableName;
    

    Then run the command below:

    df1.write.mode("overwrite").partitionBy("year", "month", "day").format("parquet").saveAsTable("databaseName.NewTableName")
    

    The catch is that you can't read this table with Hive, but you can read it with Spark (see the sketch below for a Hive-readable alternative).
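
    You can confirm the Spark side by reading the table back:

        spark.table("databaseName.NewTableName").show()

    If the table must also stay readable from Hive, one commonly suggested variant (a sketch under assumptions, not part of this answer's tested steps) is to insert into the pre-created table instead of replacing it with saveAsTable, so the SerDe metadata set up by CREATE TABLE ... LIKE is preserved:

        # Sketch: assumes the empty table created above already exists with the
        # desired partitioning. insertInto() writes into the existing table
        # definition (columns matched by position, partition columns last)
        # instead of rewriting its metadata, so Hive should still read it.
        spark.conf.set("hive.exec.dynamic.partition", "true")
        spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")
        df1.write.insertInto("databaseName.NewTableName", overwrite=True)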
