How to connect Spark SQL to remote Hive metastore (via thrift protocol) with no hive-site.xml?

Backend · Open · 8 answers · 1356 views
我寻月下人不归 asked on 2020-11-22 12:07

I'm using HiveContext with Spark SQL and I'm trying to connect to a remote Hive metastore; the only way to set the hive metastore is through including the hive-site.xml on

8 Answers
  •  攒了一身酷
     2020-11-22 13:06

    In Hadoop 3, the Spark and Hive catalogs are separate, so:

    For spark-shell (which enables Hive support via .enableHiveSupport() by default), just try:

    spark-shell --conf spark.hadoop.metastore.catalog.default=hive
    

    For a spark-submit job, create your Spark session like this:

    SparkSession.builder.appName("Test").enableHiveSupport().getOrCreate()
    

    then add this conf to your spark-submit command:

    --conf spark.hadoop.metastore.catalog.default=hive
    
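    The same settings can also be applied programmatically, which speaks to the original question of reaching a remote metastore over thrift with no hive-site.xml on the classpath. A minimal sketch in PySpark; the metastore host and port below are hypothetical placeholders:

```python
# A sketch: point Spark at a remote Hive metastore without hive-site.xml.
# The thrift URI below is a hypothetical placeholder.
METASTORE_URI = "thrift://metastore-host:9083"

def metastore_conf(uri):
    """Spark conf entries for reaching a remote Hive metastore directly."""
    return {
        "spark.hadoop.hive.metastore.uris": uri,
        # On Hadoop 3 / HDP 3, also select the Hive catalog explicitly:
        "spark.hadoop.metastore.catalog.default": "hive",
    }

# With pyspark available on the driver, the session would be built as:
# from pyspark.sql import SparkSession
# builder = SparkSession.builder.appName("Test").enableHiveSupport()
# for key, value in metastore_conf(METASTORE_URI).items():
#     builder = builder.config(key, value)
# spark = builder.getOrCreate()
```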

    But for ORC tables (and managed/internal tables more generally) it is recommended to use the Hive Warehouse Connector.
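    For completeness, a hedged sketch of reading a managed table through the Hive Warehouse Connector; it assumes the `pyspark_llap` package that ships with HWC on HDP 3, and the database/table name passed in is hypothetical:

```python
def read_managed_table(spark, table):
    """Read a Hive-managed (transactional ORC) table via the Hive
    Warehouse Connector instead of Spark's native reader."""
    # pyspark_llap ships with the Hive Warehouse Connector on HDP 3;
    # the import is deferred so this module loads without it installed.
    from pyspark_llap import HiveWarehouseSession
    hive = HiveWarehouseSession.session(spark).build()
    return hive.executeQuery("SELECT * FROM {}".format(table))
```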
