Spark configuration priority

Submitted by 强颜欢笑 on 2020-05-10 03:33:20

Question


Is there any difference or priority between specifying a Spark application's configuration in code:

SparkConf().setMaster("yarn")

and specifying it on the command line:

spark-submit --master yarn

Answer 1:


Yes. The highest priority is given to configuration set in the user's code with the set() function; after that come the flags passed to spark-submit. From the documentation:

Properties set directly on the SparkConf take highest precedence, then flags passed to spark-submit or spark-shell, then options in the spark-defaults.conf file. A few configuration keys have been renamed since earlier versions of Spark; in such cases, the older key names are still accepted, but take lower precedence than any instance of the newer key.

Source
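The documented precedence order can be sketched as a lookup where the first layer that defines a key wins. A minimal simulation using Python's ChainMap (the keys and values below are illustrative, not real Spark defaults):

```python
from collections import ChainMap

# Layers in precedence order: ChainMap returns the value from the
# first mapping that contains the key.
code_conf     = {"spark.master": "yarn"}            # SparkConf.set() in the application
submit_flags  = {"spark.master": "local[4]",
                 "spark.executor.memory": "2g"}     # flags passed to spark-submit
defaults_file = {"spark.executor.memory": "1g",
                 "spark.app.name": "from-defaults"} # spark-defaults.conf

effective = ChainMap(code_conf, submit_flags, defaults_file)

print(effective["spark.master"])           # yarn  (code beats the flag)
print(effective["spark.executor.memory"])  # 2g    (flag beats the defaults file)
print(effective["spark.app.name"])         # from-defaults (nothing overrides it)
```

Here `spark.master` resolves to `yarn` because the in-code setting shadows the spark-submit flag, while `spark.executor.memory` comes from the flag because the application code never sets it.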




Answer 2:


There are four precedence levels (1 to 4, with 1 being the highest priority):

  1. SparkConf set in the application
  2. Properties given with the spark-submit
  3. Properties given in a properties file, which can be passed as an argument at submission time
  4. Default values
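Level 3 corresponds to spark-submit's --properties-file flag. A hedged sketch of its use (the file and application names are hypothetical):

```shell
# my-app.conf holds one key/value pair per line, e.g.:
#   spark.master          yarn
#   spark.executor.memory 2g
# Passed at submission time, it replaces spark-defaults.conf as the
# source of defaults, but explicit --conf flags and SparkConf.set()
# in the application still take precedence over it:
spark-submit --properties-file my-app.conf my_app.py
```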



Answer 3:


Beyond the priority question, specifying the master on the command line lets you run the same code on different cluster managers without modifying it: the same application can run on local[n], yarn, mesos, or a Spark standalone cluster.
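For example, an application that never calls setMaster() can be pointed at any cluster manager purely from the command line (my_app.py and the standalone master URL are hypothetical):

```shell
# The same unmodified application, three different cluster managers:
spark-submit --master "local[4]" my_app.py          # local mode, 4 threads
spark-submit --master yarn my_app.py                # a YARN cluster
spark-submit --master spark://host:7077 my_app.py   # Spark standalone
```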



Source: https://stackoverflow.com/questions/36885680/spark-configuration-priority
