How to pass environment variables to spark driver in cluster mode with spark-submit

Backend · Open · 5 answers · 1715 views

有刺的猬 2021-01-01 18:33

spark-submit allows configuring the executor environment variables with --conf spark.executorEnv.FOO=bar, and the Spark REST API allows passing so…
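For the driver side in cluster mode, the trick is that the prefix depends on the cluster manager: spark.executorEnv.* only covers executors, while on YARN the driver runs inside the ApplicationMaster, so its environment is set with the spark.yarn.appMasterEnv.* prefix. A minimal sketch (FOO=bar is a placeholder variable, myapp.jar a placeholder application; adjust for your setup):

```shell
# Sketch: pass FOO=bar to both executors and the driver
# in YARN cluster mode.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.executorEnv.FOO=bar \
  --conf spark.yarn.appMasterEnv.FOO=bar \
  myapp.jar
```

On Kubernetes the analogous driver-side prefix is spark.kubernetes.driverEnv.*; in client mode the driver simply inherits the environment of the shell that runs spark-submit.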

5 Answers
  •  醉酒成梦
    2021-01-01 18:58

    Yes, that is possible. You can pass the variables you need to spark-submit, just as you are doing:

    spark-submit --deploy-mode cluster myapp.jar
    

    Take the variables from http://spark.apache.org/docs/latest/configuration.html and use whichever ones suit your optimization needs.

    I used to run in standalone cluster mode, but now I am running on YARN, so my settings are as follows (hopefully helpful):

    hastimal@nm:/usr/local/spark$ ./bin/spark-submit \
      --class com.hastimal.Processing \
      --master yarn-cluster \
      --num-executors 15 \
      --executor-memory 52g \
      --executor-cores 7 \
      --driver-memory 52g \
      --driver-cores 7 \
      --conf spark.default.parallelism=105 \
      --conf spark.driver.maxResultSize=4g \
      --conf spark.network.timeout=300 \
      --conf spark.yarn.executor.memoryOverhead=4608 \
      --conf spark.yarn.driver.memoryOverhead=4608 \
      --conf spark.akka.frameSize=1200 \
      --conf spark.io.compression.codec=lz4 \
      --conf spark.rdd.compress=true \
      --conf spark.broadcast.compress=true \
      --conf spark.shuffle.spill.compress=true \
      --conf spark.shuffle.compress=true \
      --conf spark.shuffle.manager=sort \
      /users/hastimal/Processing.jar \
      Main_Class /inputRDF/rdf_data_all.nt /output /users/hastimal/ /users/hastimal/query.txt index 2
    

    In this command, everything after the path to my jar consists of arguments to my class:

    cc /inputData/data_all.txt /output /users/hastimal/ /users/hastimal/query.txt index 2
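Once a variable reaches the driver or executor process, application code reads it from the process environment like any other environment variable. A minimal sketch in Python (the variable name FOO follows the question; the helper name get_env is my own, not part of the original answer):

```python
import os

def get_env(name, default=None):
    """Read an environment variable passed in via
    spark.executorEnv.<name> (executors) or
    spark.yarn.appMasterEnv.<name> (driver in YARN cluster mode),
    falling back to a default when it is not set."""
    return os.environ.get(name, default)

foo = get_env("FOO", "unset")
```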
