Property spark.yarn.jars - how to deal with it?

Asked by 遥遥无期, 2020-12-12 23:12

My knowledge of Spark is limited, as you will sense after reading this question. I have just one node, and Spark, Hadoop, and YARN are all installed on it.

I was abl

3 Answers
  •  春和景丽
    2020-12-12 23:31

    You could also use the spark.yarn.archive option and set it to the location of an archive (which you create) containing all the JARs from the $SPARK_HOME/jars/ directory, placed at the root level of the archive. For example:

    1. Create the archive: jar cv0f spark-libs.jar -C $SPARK_HOME/jars/ .
    2. Upload it to HDFS: hdfs dfs -put spark-libs.jar /some/path/.
      2a. On a large cluster, increase the replication factor of the Spark archive so that you reduce the number of times a NodeManager has to do a remote copy: hdfs dfs -setrep -w 10 hdfs:///some/path/spark-libs.jar (scale the number of replicas in proportion to the total number of NodeManagers).
    3. Set spark.yarn.archive to hdfs:///some/path/spark-libs.jar
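    As a sketch of step 3, the property can be set either cluster-wide in spark-defaults.conf or per job on the spark-submit command line; the HDFS path below just mirrors the example path used above:

    ```
    # conf/spark-defaults.conf -- applies to every job by default
    spark.yarn.archive    hdfs:///some/path/spark-libs.jar
    ```

    or, for a single submission:

    ```
    # per-job override on the command line
    spark-submit \
      --master yarn \
      --conf spark.yarn.archive=hdfs:///some/path/spark-libs.jar \
      your-app.jar
    ```

    With this set, YARN localizes the one pre-staged archive from HDFS instead of uploading the contents of $SPARK_HOME/jars/ on every submission.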
