spark-submit yarn-cluster with --jars does not work?

Posted by 久未见 on 2019-12-05 18:48:52

According to the help options of spark-submit:

  • --jars includes local jars on the driver and executor classpaths. [it will just set the path]

  • --files copies the files your application needs to run into the working directory of each executor node [it will transport your jar to the working dir]

Note: this is similar to the -file option in Hadoop Streaming, which ships the mapper/reducer scripts to the slave nodes; a minimal sketch follows.
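As an illustration, a typical Hadoop Streaming invocation looks like this (the script names and input/output paths are hypothetical placeholders, not from the original question):

$ hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
    -input /data/input \
    -output /data/output \
    -mapper mapper.py \
    -reducer reducer.py \
    -file mapper.py -file reducer.py    # ships both scripts to the slave nodes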

So try the --files option as well.

$ spark-submit --help
Options:
  --jars JARS                 Comma-separated list of local jars to include on the driver
                              and executor classpaths.
  --files FILES               Comma-separated list of files to be placed in the working
                              directory of each executor.
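For example, a submission that both puts a dependency jar on the driver/executor classpaths and ships it to each executor's working directory could look like this (the class and jar names here are placeholders, not from the original question):

$ spark-submit --class com.example.Main --master yarn-cluster \
    --jars mylib.jar \
    --files mylib.jar \
    myapp.jar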

hope this helps

Neelesh Salian

Have you tried the solution posted in this thread: Spark on yarn jar upload problems?

The problem was solved by copying spark-assembly.jar into a directory on HDFS that every node can read, and then passing that location to spark-submit via the --conf spark.yarn.jar parameter. The commands are listed below:

hdfs dfs -copyFromLocal /var/tmp/spark/spark-1.4.0-bin-hadoop2.4/lib/spark-assembly-1.4.0-hadoop2.4.0.jar /user/spark/spark-assembly.jar 

/var/tmp/spark/spark-1.4.0-bin-hadoop2.4/bin/spark-submit --class MRContainer --master yarn-cluster  --conf spark.yarn.jar=hdfs:///user/spark/spark-assembly.jar simplemr.jar
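A quick sanity check before submitting is to confirm the assembly actually landed at the expected HDFS path (the same path used in the commands above):

$ hdfs dfs -ls /user/spark/spark-assembly.jar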