AWS Redshift driver in Zeppelin

耗尽温柔 submitted on 2019-12-24 17:17:03

Question


I want to explore my data in Redshift using a Zeppelin notebook. A small EMR cluster with Spark is running behind it. I am loading Databricks' spark-redshift library:

%dep
z.reset()
z.load("com.databricks:spark-redshift_2.10:0.6.0")

and then

import org.apache.spark.sql.DataFrame

val query = "..."

val url = "..."
val port=5439
val table = "..."
val database = "..."
val user = "..."
val password = "..."

val df: DataFrame = sqlContext.read
  .format("com.databricks.spark.redshift")
  .option("url", s"jdbc:redshift://${url}:$port/$database?user=$user&password=$password")
  .option("query",query)
  .option("tempdir", "s3n://.../tmp/data")
  .load()

df.show

but I get the error

java.lang.ClassNotFoundException: Could not load an Amazon Redshift JDBC driver; see the README for instructions on downloading and configuring the official Amazon driver

I added the option

option("jdbcdriver", "com.amazon.redshift.jdbc41.Driver")

but that did not help. I think I need to specify Redshift's JDBC driver somewhere, as I would by passing --driver-class-path to spark-shell, but how do I do that with Zeppelin?


Answer 1:


You can add external JARs such as the JDBC driver either through Zeppelin's dependency-loading mechanism or, in the case of Spark, through the %dep dynamic dependency loader.

When your code requires an external library, instead of downloading it, copying it, and restarting Zeppelin, you can do the following with the %dep interpreter:

  • Load libraries recursively from Maven repository
  • Load libraries from local filesystem
  • Add an additional Maven repository
  • Automatically add libraries to the Spark cluster (this can be turned off)

The latter would look something like:

%dep
// loads with all transitive dependencies from Maven repo
z.load("groupId:artifactId:version")

// or add artifact from filesystem
z.load("/path/to.jar")

By convention, the %dep paragraph has to be the first paragraph of the note.
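
Applied to the question above, a minimal sketch might look like the following. It assumes the Amazon Redshift JDBC driver JAR has already been downloaded to the Zeppelin host. The path /path/to/RedshiftJDBC41.jar is a placeholder, not a real location. It also assumes the paragraph runs before the Spark interpreter has started (restart the interpreter first if it is already running).

```scala
%dep
// Clear any previously loaded artifacts, as in the question
z.reset()
// spark-redshift connector, same coordinates as in the question
z.load("com.databricks:spark-redshift_2.10:0.6.0")
// Redshift JDBC driver from the local filesystem
// (hypothetical path; adjust to wherever the JAR was downloaded)
z.load("/path/to/RedshiftJDBC41.jar")
```

With the driver JAR on the classpath, the .option("jdbcdriver", "com.amazon.redshift.jdbc41.Driver") line from the question should be able to resolve the class, provided the downloaded JAR actually contains a driver under that class name.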



Source: https://stackoverflow.com/questions/36745618/aws-redshift-driver-in-zeppelin
