I am using Spark 2.4.1 and Java 8. I am trying to load an external properties file when submitting my Spark job with spark-submit.
As I am using below TypeSafe to lo
With --files you should access the resource using SparkFiles.get as follows:
$ ./bin/spark-shell --files README.md
scala> import org.apache.spark._
import org.apache.spark._
scala> SparkFiles.get("README.md")
res0: String = /private/var/folders/0w/kb0d3rqn4zb9fcc91pxhgn8w0000gn/T/spark-f0b16df1-fba6-4462-b956-fc14ee6c675a/userFiles-eef6d900-cd79-4364-a4a2-dd177b4841d2/README.md
In other words, Spark distributes the files passed with --files to the executors, but the only way to know the local path of those files is to use the SparkFiles utility.
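Since the question uses Java, here is a minimal sketch of reading a distributed properties file once its local path is known. The class name PropsFromSparkFiles, the resolver parameter, and the file name app.properties are hypothetical, introduced only for illustration; the resolver stands in for SparkFiles.get so that the snippet compiles without Spark on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;
import java.util.function.Function;

// Hypothetical helper, not part of Spark.
final class PropsFromSparkFiles {
    // resolver maps a bare file name to its local path on the current node.
    // Assumption: in a real Spark job this would be SparkFiles::get
    // (org.apache.spark.SparkFiles), called after the SparkContext exists.
    static Properties load(String fileName, Function<String, String> resolver)
            throws IOException {
        String localPath = resolver.apply(fileName);
        Properties props = new Properties();
        // Read the file from wherever the resolver says it was placed.
        try (InputStream in = Files.newInputStream(Paths.get(localPath))) {
            props.load(in);
        }
        return props;
    }
}
```

In a job submitted with --files app.properties, the call would presumably look like PropsFromSparkFiles.load("app.properties", SparkFiles::get).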
The other option would be to package all resource files into a jar file and bundle it together with the other jar files (either as a single uber-jar or simply as part of the CLASSPATH of the Spark app) and use the following trick:
this.getClass.getClassLoader.getResourceAsStream(resourceFile)
With that, regardless of the jar file the resourceFile is in, as long as it's on the CLASSPATH, it should be available to the application.
I'm pretty sure any decent framework or library that uses resource files for configuration, e.g. Typesafe Config, accepts an InputStream (or a Reader wrapped around one) as a way to read resource files.
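A Java version of the classpath trick might look like the sketch below. It uses only java.util.Properties from the standard library; the class name ClasspathProps and its method names are hypothetical, and a configuration library would take the place of Properties.load in a real application.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical helper for illustration only.
final class ClasspathProps {
    // Parses key=value pairs from any stream; the caller closes the stream.
    static Properties load(InputStream in) throws IOException {
        Properties props = new Properties();
        props.load(in);
        return props;
    }

    // Looks the resource up on the classpath, whichever jar it lives in.
    static Properties loadFromClasspath(String resourceFile) throws IOException {
        ClassLoader loader = ClasspathProps.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream(resourceFile)) {
            // getResourceAsStream returns null when the resource is missing.
            if (in == null) {
                throw new IOException("Resource not found on classpath: " + resourceFile);
            }
            return load(in);
        }
    }
}
```

The explicit null check matters because getResourceAsStream does not throw when the resource is absent, so a missing file would otherwise surface later as a NullPointerException.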
You could also put the files into a jar that is on the CLASSPATH of the executors instead of passing them with --files, but that would obviously be less flexible: every time you wanted to submit your Spark app with a different file, you'd have to rebuild the jar.