I\'d like to stop various messages that are coming on spark shell.
I tried to edit the log4j.properties
file in order to stop these message.
Her
Use below command to change log level while submitting application using spark-submit or spark-sql:
spark-submit \
--conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/log4j.xml" \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/log4j.xml"
Note: replace
where log4j
config file is stored.
Log4j.properties:
log4j.rootLogger=ERROR, console
# set the log level for these components
log4j.logger.com.test=DEBUG
log4j.logger.org=ERROR
log4j.logger.org.apache.spark=ERROR
log4j.logger.org.spark-project=ERROR
log4j.logger.org.apache.hadoop=ERROR
log4j.logger.io.netty=ERROR
log4j.logger.org.apache.zookeeper=ERROR
# add a ConsoleAppender to the logger stdout to write to the console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
# use a simple message format
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
log4j.xml
Switch to FileAppender in log4j.xml if you want to write logs to file instead of console. LOG_DIR
is a variable for logs directory which you can supply using spark-submit --conf "spark.driver.extraJavaOptions=-D
.
Another important thing to understand here is, when job is launched in distributed mode ( deploy-mode cluster and master as yarn or mesos) the log4j configuration file should exist on driver and worker nodes (log4j.configuration=file:
) else log4j init will complain-
log4j:ERROR Could not read configuration file [log4j.properties]. java.io.FileNotFoundException: log4j.properties (No such file or directory)
Hint on solving this problem-
Keep log4j config file in distributed file system(HDFS or mesos) and add external configuration using log4j PropertyConfigurator. or use sparkContext addFile to make it available on each node then use log4j PropertyConfigurator to reload configuration.