I have a sample application working that reads from CSV files into a DataFrame. The DataFrame can be stored to a Hive table in parquet format using the method
df.saveAsTable(tablename, mode).
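For example, once dynamic partitioning is enabled (see the configuration below), a write into a partitioned Hive table could look roughly like this; the table name my_db.events and the partition column event_date are only placeholders:

df.write \
    .mode("append") \
    .format("parquet") \
    .partitionBy("event_date") \
    .saveAsTable("my_db.events")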
Dynamic partitioning can be configured on the SparkSession like this:
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    ...
    .config("spark.hadoop.hive.exec.dynamic.partition", "true") \
    .config("spark.hadoop.hive.exec.dynamic.partition.mode", "nonstrict") \
    .enableHiveSupport() \
    .getOrCreate()
or you can add them to a .properties file such as spark-defaults.conf.
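For instance, the same two settings in spark-defaults.conf (or any file passed to spark-submit with --properties-file) would look like this:

spark.hadoop.hive.exec.dynamic.partition        true
spark.hadoop.hive.exec.dynamic.partition.mode   nonstrict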
The spark.hadoop. prefix is needed by the Spark config (at least in 2.4); it tells Spark to copy the setting into the Hadoop configuration, and here is how Spark does that:
/**
 * Appends spark.hadoop.* configurations from a [[SparkConf]] to a Hadoop
 * configuration without the spark.hadoop. prefix.
 */
def appendSparkHadoopConfigs(conf: SparkConf, hadoopConf: Configuration): Unit = {
  SparkHadoopUtil.appendSparkHadoopConfigs(conf, hadoopConf)
}
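If you want to check that the prefix really was stripped, you can read the resulting Hadoop configuration back from the running session. This is just a quick sanity check; _jsc is an internal handle to the JavaSparkContext, not a public API:

# assumes `spark` is the session built above
hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
print(hadoop_conf.get("hive.exec.dynamic.partition"))        # true
print(hadoop_conf.get("hive.exec.dynamic.partition.mode"))   # nonstrict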