spark-submit, how to specify log4j.properties

旧巷少年郎 2021-02-02 00:04

In spark-submit, how do I specify log4j.properties?

Here is my script. I have tried every combination, even running on just one local node, but it looks like the log4j.propert

6 answers
  •  天命终不由人
    2021-02-02 00:38

    Solution for Spark on YARN

    For me, running Spark on YARN, simply adding --files log4j.properties fixed everything.
    1. Make sure the directory where you run spark-submit contains the file "log4j.properties".
    2. Run spark-submit ... --files log4j.properties
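The two steps above can be sketched as a small shell script. This is only an illustration under stated assumptions: the log4j.properties content is a minimal log4j 1.x console setup (the 1.2.17 extras jar in this answer suggests log4j 1.x), and the class name `com.example.MyApp` and jar `my-app.jar` are hypothetical placeholders. The command is printed rather than executed, since the sketch assumes no cluster is available.

```shell
set -e

# Work in a scratch directory so the sketch does not touch real files.
workdir="$(mktemp -d)"
cd "$workdir"

# Step 1: put log4j.properties in the directory where spark-submit is run.
# Assumption: a minimal log4j 1.x config that logs INFO and above to stderr.
cat > log4j.properties <<'EOF'
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
EOF

# Step 2: pass the file with --files.
# com.example.MyApp and my-app.jar are hypothetical; replace them with your own.
submit_cmd="spark-submit --master yarn --deploy-mode cluster --files log4j.properties --class com.example.MyApp my-app.jar"

# Print the command instead of running it.
echo "$submit_cmd"
```

To actually submit, run the printed command from the same directory, with your real class and jar in place of the placeholders.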

    Let's see why this works.

    1. spark-submit uploads log4j.properties to HDFS, as this log line shows:

    20/03/31 01:22:51 INFO Client: Uploading resource file:/home/ssd/homework/shaofengfeng/tmp/firesparkl-1.0/log4j.properties -> hdfs://sandbox/user/homework/.sparkStaging/application_1580522585397_2668/log4j.properties
    

    2. When YARN launches the containers for the driver or the executors, it downloads all uploaded files into the node's local file cache, including the files under ${spark_home}/jars, ${spark_home}/conf and ${hadoop_conf_dir}, as well as the files specified by --jars and --files.
    3. Before launching a container, YARN exports the CLASSPATH and creates symbolic links like this:

    export CLASSPATH="$PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*
    
    ln -sf "/var/hadoop/yarn/local/usercache/homework/filecache/1484419/log4j.properties" "log4j.properties"
    hadoop_shell_errorcode=$?
    if [ $hadoop_shell_errorcode -ne 0 ]
    then
      exit $hadoop_shell_errorcode
    fi
    ln -sf "/var/hadoop/yarn/local/usercache/homework/filecache/1484440/apache-log4j-extras-1.2.17.jar" "apache-log4j-extras-1.2.17.jar"
    

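    4. After step 3, "log4j.properties" is already on the CLASSPATH, because the symbolic link sits in the container's working directory and $PWD is the first CLASSPATH entry. There is no need to set spark.driver.extraJavaOptions or spark.executor.extraJavaOptions.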
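Steps 3 and 4 can be imitated locally to see why the symbolic link is enough. The sketch below is not YARN's launch script: the two temporary directories merely stand in for the node's file cache and the container's working directory, and the one-line properties file is made up for the demonstration.

```shell
set -e

# Stand-ins for YARN's local file cache and the container working directory.
cache_dir="$(mktemp -d)"
container_dir="$(mktemp -d)"

# Pretend YARN has already downloaded the uploaded file into its cache.
echo 'log4j.rootCategory=WARN, console' > "$cache_dir/log4j.properties"

# Same move as the launch script: link the cached file into the working directory.
cd "$container_dir"
ln -sf "$cache_dir/log4j.properties" "log4j.properties"

# $PWD is the first classpath entry, so a lookup of "log4j.properties"
# in that entry follows the link to the cached file.
classpath="$PWD:$PWD/__spark_conf__"
first_entry="${classpath%%:*}"
resolved="$(cat "$first_entry/log4j.properties")"
echo "$resolved"
```

Because the file resolves through the first classpath entry, it is found before any log4j.properties that might sit in a later entry such as `__spark_conf__`.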
