Question
I updated my CDH cluster to use Spark 1.5.0. When I submit a Spark application, the system shows a warning about spark.app.id:
Using default name DAGScheduler for source because spark.app.id is not set.
I searched for spark.app.id but found no documentation about it. I read this link, and I think it is used for REST API calls.
I don't see this warning in Spark 1.4. Could someone explain it to me and show how to set it?
Answer 1:
It's not necessarily used for the REST API, but rather for monitoring purposes, e.g. when you want to check the YARN logs:
yarn logs -applicationId <spark.app.id>
It's true that this property is not documented yet. I think it was added to standardize application deployment within the Hadoop ecosystem.
I suggest that you set the 'spark.app.id' in your app.
conf.set("spark.app.id", <app-id>) // considering that you already have a SparkConf defined of course
Nevertheless, this is only a warning, and it won't affect the application itself.
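As a rough sketch of the suggestion above, the property can be set on the SparkConf before the SparkContext is created. The object name, the application name "app-id-example" and the ID value "my-spark-app-001" are hypothetical placeholders, and the master URL is assumed to be supplied by spark-submit. On YARN the cluster manager assigns its own application ID, so the value reported at runtime may differ from the one set here.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object AppIdExample {
  def main(args: Array[String]): Unit = {
    // Both string values below are hypothetical placeholders.
    val conf = new SparkConf()
      .setAppName("app-id-example")
      .set("spark.app.id", "my-spark-app-001")

    // The property has to be in the conf before the context is constructed.
    val sc = new SparkContext(conf)
    try {
      // Print the application ID that the running context reports.
      println(s"application id = ${sc.applicationId}")
    } finally {
      sc.stop()
    }
  }
}
```

Comparing the printed ID with the one shown by the cluster manager is a quick way to see which value is actually in effect for a given deployment.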
Source: https://stackoverflow.com/questions/32793276/spark-1-5-0-spark-app-id-warning