How to pass environment variables to spark driver in cluster mode with spark-submit

Backend · Open · 5 answers · 1715 views

有刺的猬 2021-01-01 18:33

spark-submit allows configuring the executor environment variables with --conf spark.executorEnv.FOO=bar, and the Spark REST API allows passing so…
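For the driver side in cluster mode, the trick is that the prefix depends on the cluster manager: spark.executorEnv.* only covers executors, while on YARN the driver runs inside the ApplicationMaster, so its environment is set with the spark.yarn.appMasterEnv.* prefix. A minimal sketch (FOO=bar is a placeholder variable, myapp.jar a placeholder application; adjust for your setup):

```shell
# Sketch: pass FOO=bar to both executors and the driver
# in YARN cluster mode.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.executorEnv.FOO=bar \
  --conf spark.yarn.appMasterEnv.FOO=bar \
  myapp.jar
```

On Kubernetes the analogous driver-side prefix is spark.kubernetes.driverEnv.*; in client mode the driver simply inherits the environment of the shell that runs spark-submit.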

5 Answers
  •  醉酒成梦
    2021-01-01 18:58

    Yes, that is possible. You can pass the variables you need to spark-submit, just as you are doing:

    spark-submit --deploy-mode cluster myapp.jar
    

    Take the variables from http://spark.apache.org/docs/latest/configuration.html and use whichever ones suit your optimization needs.

    I used to run in standalone cluster mode, but now I am running on YARN, so my settings are as follows (hopefully helpful):

    hastimal@nm:/usr/local/spark$ ./bin/spark-submit \
      --class com.hastimal.Processing \
      --master yarn-cluster \
      --num-executors 15 \
      --executor-memory 52g \
      --executor-cores 7 \
      --driver-memory 52g \
      --driver-cores 7 \
      --conf spark.default.parallelism=105 \
      --conf spark.driver.maxResultSize=4g \
      --conf spark.network.timeout=300 \
      --conf spark.yarn.executor.memoryOverhead=4608 \
      --conf spark.yarn.driver.memoryOverhead=4608 \
      --conf spark.akka.frameSize=1200 \
      --conf spark.io.compression.codec=lz4 \
      --conf spark.rdd.compress=true \
      --conf spark.broadcast.compress=true \
      --conf spark.shuffle.spill.compress=true \
      --conf spark.shuffle.compress=true \
      --conf spark.shuffle.manager=sort \
      /users/hastimal/Processing.jar \
      Main_Class /inputRDF/rdf_data_all.nt /output /users/hastimal/ /users/hastimal/query.txt index 2
    

    In this command, everything after the path to my jar consists of arguments to my class:

    cc /inputData/data_all.txt /output /users/hastimal/ /users/hastimal/query.txt index 2
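Once a variable reaches the driver or executor process, application code reads it from the process environment like any other environment variable. A minimal sketch in Python (the variable name FOO follows the question; the helper name get_env is my own, not part of the original answer):

```python
import os

def get_env(name, default=None):
    """Read an environment variable passed in via
    spark.executorEnv.<name> (executors) or
    spark.yarn.appMasterEnv.<name> (driver in YARN cluster mode),
    falling back to a default when it is not set."""
    return os.environ.get(name, default)

foo = get_env("FOO", "unset")
```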
