Question
I have a simple Spark app that reads the master from a config file:
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setMaster(config.getString(SPARK_MASTER))
  .setAppName(config.getString(SPARK_APPNAME))
What will happen when I run my app as follows:
spark-submit --class <main class> --master yarn <my jar>
Is my master going to be overwritten?
I would prefer to have the master provided in the standard way so I don't need to maintain it in my configuration, but then the question is: how can I run this job directly from IDEA? The master isn't an application argument but a spark-submit argument.
Just for clarification, my desired end product should:
- when run on the cluster with --master yarn, use that master
- when run from IDEA, run with local[*]
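To make the intent concrete, here is a minimal sketch of the kind of fallback I have in mind (setIfMissing is just one way I could express it, not what my current code does; SPARK_APPNAME is my config key):

import org.apache.spark.SparkConf

// When submitted with --master yarn, spark.master is already set before this
// code runs, so setIfMissing does nothing; when launched from IDEA with no
// master provided, it falls back to local[*].
val conf = new SparkConf()
  .setIfMissing("spark.master", "local[*]")
  .setAppName(config.getString(SPARK_APPNAME))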
Answer 1:
- Do not set the master in your code.
- In production you can use the --master option of spark-submit, which tells Spark which master to use (yarn in your case). The value of spark.master in the spark-defaults.conf file will also do the job (--master takes priority over the property in the configuration file).
- In an IDE... well, I know that in Eclipse you can pass a VM argument in the Run Configuration, for example -Dspark.master=local[*] (https://stackoverflow.com/a/24481688/1314742). In IDEA it is not much different; you can add VM options in the Run/Debug Configuration as well.
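Putting it together, a minimal sketch of this setup (assuming a Scala app; the app name still comes from your config, but the code never calls setMaster):

import org.apache.spark.{SparkConf, SparkContext}

// No setMaster here: spark-submit (or the IDE's -Dspark.master VM option)
// supplies spark.master, and SparkConf picks it up from the system properties.
val conf = new SparkConf().setAppName(config.getString(SPARK_APPNAME))
val sc = new SparkContext(conf)

On the cluster:

spark-submit --class <main class> --master yarn <my jar>

From IDEA, in the Run Configuration's VM options:

-Dspark.master=local[*]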
Source: https://stackoverflow.com/questions/35902259/understanding-spark-master