I'd like to save data in a Spark (v1.3.0) DataFrame to a Hive table using PySpark.
The documentation states:
"spark.sql.hive.convertMetasto…"
I've been there...
The API is somewhat misleading on this one.
DataFrame.saveAsTable
does not create a Hive table; it creates an internal Spark table source.
It also stores something in the Hive metastore, but not what you intend.
This remark was made on the spark-user mailing list regarding Spark 1.3.
If you wish to create a Hive table from Spark, you can use this approach:
1. Use CREATE TABLE ...
via SparkSQL to register the table in the Hive metastore.
2. Use DataFrame.insertInto(tableName, overwrite)
to write the actual data (Spark 1.3).
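A minimal sketch of the two steps above, assuming a Spark 1.3 HiveContext bound to `sqlContext` and a DataFrame `df` with columns `name` (string) and `age` (int); the table name `my_hive_table` and the schema are illustrative:

```python
# Step 1: build the DDL and register the table in the Hive metastore
# via SparkSQL. STORED AS PARQUET is one reasonable choice of format.
table_name = "my_hive_table"
ddl = """
CREATE TABLE IF NOT EXISTS {table} (
    name STRING,
    age INT
)
STORED AS PARQUET
""".format(table=table_name).strip()

# In a live Spark session (requires a Hive-enabled build):
# sqlContext.sql(ddl)

# Step 2: write the DataFrame rows into the now-existing Hive table.
# Pass overwrite=True to replace existing data instead of appending.
# df.insertInto(table_name, overwrite=False)
```

The Spark calls are commented out because they need a running cluster with a Hive metastore; the point is the order of operations — create the table first, then insert, rather than relying on saveAsTable.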