How to add configuration file to classpath of all Spark executors in Spark 1.2.0?

和自甴很熟 提交于 2019-12-03 16:29:56

问题


I'm using Typesafe Config, https://github.com/typesafehub/config, to parameterize a Spark job running in yarn-cluster mode with a configuration file. The default behavior of Typesafe Config is to search the classpath for resources with names matching a regex and to load them into your configuration class automatically with ConfigFactory.load() (for our purposes, assume the file it looks for is called application.conf).

I am able to load the configuration file into the driver using --driver-class-path <directory containing configuration file>, but using --conf spark.executor.extraClassPath=<directory containing configuration file> does not put the resource on the classpath of all executors like it should. The executors report that they can not find a certain configuration setting for a key that does exist in the configuration file that I'm attempting to add to their classpaths.

What is the correct way to add a file to the classpaths of all executor JVMs using Spark?


回答1:


It looks like the value of the spark.executor.extraClassPath property is relative to the working directory of the application ON THE EXECUTOR.

So, to use this property correctly, one should use --files <configuration file> to first direct Spark to copy the file to the working directory of all executors, then use spark.executor.extraClassPath=./ to add the executor's working directory to its classpath. This combination results in the executor being able to read values from the configuration file.




回答2:


I use the SparkContext addFile method

val sc: SparkContext = {
  val sparkConf = new SparkConf()
    .set("spark.storage.memoryFraction", "0.3")
    .set("spark.driver.maxResultSize", "10g")
    .set("spark.default.parallelism", "1024")
    .setAppName("myproject")
  sparkConf.setMaster(getMaster)
  val sc = new SparkContext(sparkConf)
  sc.addFile("application.conf")
  sc
}


来源:https://stackoverflow.com/questions/31707722/how-to-add-configuration-file-to-classpath-of-all-spark-executors-in-spark-1-2-0

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!