I have some third party Database client libraries in Java. I want to access them through
java_gateway.py
E.g: to make the client class (not
All the above answers did not work for me
What I had to do with pyspark was
pyspark --py-files /path/to/jar/xxxx.jar
For Jupyter Notebook:
spark = (SparkSession
.builder
.appName("Spark_Test")
.master('yarn-client')
.config("spark.sql.warehouse.dir", "/user/hive/warehouse")
.config("spark.executor.cores", "4")
.config("spark.executor.instances", "2")
.config("spark.sql.shuffle.partitions","8")
.enableHiveSupport()
.getOrCreate())
# Do this
spark.sparkContext.addPyFile("/path/to/jar/xxxx.jar")
Link to the source where I found it: https://github.com/graphframes/graphframes/issues/104