I'm trying to automatically add jars to my PySpark classpath. Right now I can type the following command and it works:
$ pyspark --jars /path/to/my.jar
You can add the jar files in the spark-defaults.conf file (located in the conf folder of your Spark installation). If there is more than one entry in the jars list, use : as the separator (on Windows the JVM classpath separator is ; instead).
spark.driver.extraClassPath /path/to/my.jar
This property is documented in https://spark.apache.org/docs/1.3.1/configuration.html#runtime-environment
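As a rough illustration of the separator rule, here is a small Python sketch that builds the config line for several jars. The helper name classpath_line and the second jar path are made up for this example; only the property name comes from the answer above.

```python
import os


def classpath_line(prop, jars, sep=os.pathsep):
    """Return one spark-defaults.conf line that puts `jars` on `prop`.

    `sep` defaults to the platform classpath separator
    (":" on Linux/macOS, ";" on Windows).
    """
    if not jars:
        raise ValueError("need at least one jar path")
    return f"{prop} {sep.join(jars)}"


# Placeholder jar paths; sep is fixed to ":" so the output does not
# depend on the machine running this snippet.
print(classpath_line(
    "spark.driver.extraClassPath",
    ["/path/to/my.jar", "/path/to/other.jar"],
    sep=":",
))
```

The printed line can be pasted into spark-defaults.conf as is.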
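The Spark 2.x configuration page also documents
spark.driver.extraLibraryPath
and spark.executor.extraLibraryPath
https://spark.apache.org/docs/2.4.3/configuration.html#runtime-environment
Note that these set the library path used when launching the JVM (for native libraries), not the Java classpath, so they are not a replacement for adding jars.
ps. spark.driver.extraClassPath
and spark.executor.extraClassPath
are still there in Spark 2.4.3;
the documentation describes spark.executor.extraClassPath as existing primarily for backwards compatibility with older versions of Spark.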
As far as I know, you have to add the jars to both the driver AND the executors. So you need to edit conf/spark-defaults.conf
and add both lines below.
spark.driver.extraClassPath /path/to/my.jar
spark.executor.extraClassPath /path/to/my.jar
When I went through this, I did not need any other parameters. I guess you will not need them either.
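If you would rather script that edit than do it by hand, something like the following Python sketch could work. The function name add_jar_to_defaults is hypothetical, and the sketch assumes the file uses the plain "key value" or "key=value" forms; it skips any property that is already set instead of merging classpaths. The demo writes to a temporary file, not to a real Spark installation.

```python
import re
import tempfile
from pathlib import Path

# The two properties named in the answer above.
CLASSPATH_PROPS = (
    "spark.driver.extraClassPath",
    "spark.executor.extraClassPath",
)


def add_jar_to_defaults(conf_file, jar):
    """Append driver and executor classpath lines for `jar`.

    Properties that already appear in the file are left untouched.
    Returns the list of lines that were appended.
    """
    path = Path(conf_file)
    lines = path.read_text().splitlines() if path.exists() else []
    # Collect keys already present, ignoring blank lines and comments.
    already_set = {
        re.split(r"[=\s]", line.strip(), maxsplit=1)[0]
        for line in lines
        if line.strip() and not line.lstrip().startswith("#")
    }
    added = [f"{prop} {jar}" for prop in CLASSPATH_PROPS if prop not in already_set]
    if added:
        path.write_text("\n".join(lines + added) + "\n")
    return added


# Demo against a throwaway file; /path/to/my.jar is a placeholder.
with tempfile.TemporaryDirectory() as tmp:
    conf = Path(tmp) / "spark-defaults.conf"
    add_jar_to_defaults(conf, "/path/to/my.jar")
    print(conf.read_text())
```

For a real installation you would pass the path to conf/spark-defaults.conf instead of the temporary file.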