Calling JDBC to impala/hive from within a spark job and creating a table

Posted by 丶灬走出姿态 on 2020-02-13 03:13:27

Question


I am trying to write a Spark job in Scala that opens a JDBC connection to Impala so that I can create a table and perform other operations.

How do I do this? Any example would be a great help. Thank you!


Answer 1:


import java.sql.{DriverManager, ResultSet}

val JDBCDriver = "com.cloudera.impala.jdbc41.Driver"
val ConnectionURL = "jdbc:impala://url.server.net:21050/default;auth=noSasl"

// Register the Impala JDBC driver and open a connection
Class.forName(JDBCDriver).newInstance
val con = DriverManager.getConnection(ConnectionURL)
val stmt = con.createStatement()
val rs = stmt.executeQuery(query) // `query` is the SQL string you want to run

// Walk the ResultSet and convert each row; note rs.next() must be called
// before reading, which is why the iterator pairs the advance with the read
val resultSetList = Iterator.continually((rs.next(), rs)).takeWhile(_._1).map(r => {
    getRowFromResultSet(r._2) // your converter: (ResultSet) => (spark.sql.Row)
}).toList

// Distribute the collected rows as an RDD on the driver's SparkContext
sc.parallelize(resultSetList)
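The snippet above only runs a query. To actually create a table, as the question asks, you can issue DDL over the same connection with executeUpdate instead of executeQuery. A minimal sketch, reusing the ConnectionURL above; the table name and columns are illustrative, and resources are closed in finally blocks so the connection is released even if the statement fails:

import java.sql.DriverManager

val con = DriverManager.getConnection(ConnectionURL)
try {
  val stmt = con.createStatement()
  try {
    // DDL goes through executeUpdate (or execute), not executeQuery,
    // because it returns no ResultSet
    stmt.executeUpdate(
      "CREATE TABLE IF NOT EXISTS example_table (id INT, name STRING) STORED AS PARQUET")
  } finally {
    stmt.close()
  }
} finally {
  con.close()
}

The same pattern works for INSERT, ALTER TABLE, or any other statement that does not return rows.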


Source: https://stackoverflow.com/questions/26634853/calling-jdbc-to-impala-hive-from-within-a-spark-job-and-creating-a-table
