Property spark.yarn.jars - how to deal with it?

Asked by 遥遥无期, 2020-12-12 23:12

My knowledge of Spark is limited, as you will sense after reading this question. I have just one node, and Spark, Hadoop, and YARN are all installed on it.

I was abl

3 Answers
  •  春和景丽
    2020-12-12 23:31

    You could also use the spark.yarn.archive option and set it to the location of an archive (which you create) containing all the JARs from the $SPARK_HOME/jars/ directory, placed at the root level of the archive. For example:

    1. Create the archive: jar cv0f spark-libs.jar -C $SPARK_HOME/jars/ .
    2. Upload it to HDFS: hdfs dfs -put spark-libs.jar /some/path/.
      2a. On a large cluster, increase the replication factor of the Spark archive so that you reduce the number of times a NodeManager has to do a remote copy: hdfs dfs -setrep -w 10 hdfs:///some/path/spark-libs.jar (scale the number of replicas in proportion to the total number of NodeManagers).
    3. Set spark.yarn.archive to hdfs:///some/path/spark-libs.jar
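    As a sketch of step 3, the property can be set either cluster-wide in spark-defaults.conf or per job on the spark-submit command line; the HDFS path below just mirrors the example path used above:

    ```
    # conf/spark-defaults.conf -- applies to every job by default
    spark.yarn.archive    hdfs:///some/path/spark-libs.jar
    ```

    or, for a single submission:

    ```
    # per-job override on the command line
    spark-submit \
      --master yarn \
      --conf spark.yarn.archive=hdfs:///some/path/spark-libs.jar \
      your-app.jar
    ```

    With this set, YARN localizes the one pre-staged archive from HDFS instead of uploading the contents of $SPARK_HOME/jars/ on every submission.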
