I have some third-party database client libraries in Java. I want to access them through java_gateway.py.
E.g: to make the client class (not
When loading Java/Scala libs from PySpark, neither --jars nor spark.jars works in version 2.4.0 and earlier (I didn't check newer versions). I'm surprised how many people claim that it works.
The main problem is that for the class loader retrieved in the following way:
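from pyspark.sql import SparkSession

jvm = SparkSession.builder.getOrCreate()._jvm
# my.scala.MyClass is a placeholder for your fully qualified class name
clazz = jvm.my.scala.MyClass
# or
clazz = jvm.java.lang.Class.forName('my.scala.MyClass')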
it works only when you copy the jar files to ${SPARK_HOME}/jars (that approach works for me).
But when your only option is --jars or spark.jars, a different class loader is used (a child class loader), and it is set as the context class loader of the current thread. So your Python code needs to look like:
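# object_name is the fully qualified name of a Scala object; its compiled singleton class has a trailing $
clazz = jvm.java.lang.Thread.currentThread().getContextClassLoader().loadClass(f"{object_name}$")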
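If you need this in more than one place, the lookup above can be wrapped in a small helper. This is only a sketch: the function name `load_jvm_class` and the `scala_object` flag are my own invention, not part of PySpark or py4j, and `jvm` is assumed to be the `_jvm` handle obtained from the SparkSession as shown earlier.

```python
def load_jvm_class(jvm, class_name, scala_object=False):
    """Load a class through the current thread's context class loader.

    jvm          -- the py4j JVM view, e.g. SparkSession.builder.getOrCreate()._jvm
    class_name   -- fully qualified class name (placeholder names in the usage below)
    scala_object -- hypothetical flag: append '$' to target a Scala object's
                    compiled singleton class
    """
    name = f"{class_name}$" if scala_object else class_name
    # The context class loader is the one that can see jars added via
    # --jars or spark.jars, per the explanation above.
    loader = jvm.java.lang.Thread.currentThread().getContextClassLoader()
    return loader.loadClass(name)


# Usage (assumes a running Spark session and a jar passed with --jars):
# from pyspark.sql import SparkSession
# jvm = SparkSession.builder.getOrCreate()._jvm
# clazz = load_jvm_class(jvm, "my.scala.MyObject", scala_object=True)
```

Taking `jvm` as a parameter keeps the helper independent of how the session is created, so you can also exercise it without starting Spark.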
Hope it explains your troubles. Give me a shout if not.