SaveAsTable in Spark Scala: HDP3.x

Posted by 不羁岁月 on 2020-05-17 06:08:08

Question


I have a DataFrame in Spark that I am trying to save to Hive as a table, but I get the error message below:

    java.lang.RuntimeException: com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector does not allow create table as select.
        at scala.sys.package$.error(package.scala:27)

Can anyone please help me with how I should save this as a table in Hive?

    val df3 = df1.join(df2, df1("inv_num") === df2("inv_num"))  // join both dataframes on the inv_num column
      .withColumn("finalSalary",
        when(df1("salary") < df2("salary"), df2("salary") - df1("salary"))
          .otherwise(
            when(df1("salary") > df2("salary"), df1("salary") + df2("salary"))  // 5000 + 3000 = 8000 check
              .otherwise(df2("salary"))))  // otherwise take the salary from the second dataframe
      .drop(df1("salary"))
      .drop(df2("salary"))
      .withColumnRenamed("finalSalary", "salary")

The code below is not working; when I execute it, it throws the following error:

    java.lang.RuntimeException: com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector does not allow create table as select.
        at scala.sys.package$.error(package.scala:27)

    df3.write
      .format("com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector")
      .option("database", "dbname")
      .option("table", "tablename")
      .mode("Append")
      .saveAsTable("tablename")

Note: The table already exists in the database, and I am using HDP 3.x.


Answer 1:


According to the Spark documentation, the behaviour of saveAsTable depends on the save mode used; the default is ErrorIfExists. In your case, since you are writing to Hive and the table already exists, try insertInto instead (a sketch follows below), but keep in mind that the order of the DataFrame's columns must match the column order of the destination table.
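A minimal sketch of that approach, assuming the destination table is dbname.tablename from the question and that its columns are exactly (inv_num, salary) in that order; adjust the select to match your real table, since insertInto matches columns by position, not by name:

    // Sketch only: assumes dbname.tablename already exists with columns
    // (inv_num, salary) in this order -- insertInto is position-based.
    df3.select("inv_num", "salary")
      .write
      .mode("append")
      .insertInto("dbname.tablename")

Note that on HDP 3.x a managed (ACID) Hive table still has to be written through the Hive Warehouse Connector; a plain insertInto only works against tables Spark itself is allowed to write to.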




Answer 2:


Try registering the DataFrame as a temporary view, selecting from it with spark.sql(), and then writing the result through the Hive Warehouse Connector with save(), which (unlike saveAsTable) the connector supports:

    import com.hortonworks.hwc.HiveWarehouseSession._  // provides the HIVE_WAREHOUSE_CONNECTOR constant

    df3.createOrReplaceTempView("tablename")           // registerTempTable is deprecated
    spark.sql("SELECT salary FROM tablename")
      .write.format(HIVE_WAREHOUSE_CONNECTOR)
      .option("database", "dbname")
      .option("table", "tablename")
      .mode("append")
      .save()




Answer 3:


See if the solution below works for you:

    val df3 = df1.join(df2, df1("inv_num") === df2("inv_num"))  // join both dataframes on the inv_num column
      .withColumn("finalSalary",
        when(df1("salary") < df2("salary"), df2("salary") - df1("salary"))
          .otherwise(
            when(df1("salary") > df2("salary"), df1("salary") + df2("salary"))  // 5000 + 3000 = 8000 check
              .otherwise(df2("salary"))))  // otherwise take the salary from the second dataframe
      .drop(df1("salary"))
      .drop(df2("salary"))
      .withColumnRenamed("finalSalary", "salary")

    val hive = com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.session(spark).build()

    df3.createOrReplaceTempView("<temp-tbl-name>")

    hive.setDatabase("<db-name>")
    hive.createTable("<tbl-name>")
      .ifNotExists()
      .column("<col-name>", "<col-type>")  // add one .column(name, type) call per column of df3
      .create()                            // the table is only created once create() is called

    spark.sql("SELECT salary FROM <temp-tbl-name>")
      .write
      .format(HIVE_WAREHOUSE_CONNECTOR)
      .mode("append")
      .option("table", "<tbl-name>")
      .save()


Source: https://stackoverflow.com/questions/61819955/saveastable-in-spark-scala-hdp3-x
