Spark:executor.CoarseGrainedExecutorBackend: Driver Disassociated disassociated

Posted by 冷暖自知 on 2019-12-22 00:48:44

Question


I am learning how to use Spark, and I have a simple program. When I run the jar file it gives me the right result, but there are some errors in the stderr file, like this:

 15/05/18 18:19:52 ERROR executor.CoarseGrainedExecutorBackend: Driver   Disassociated [akka.tcp://sparkExecutor@localhost:51976] -> [akka.tcp://sparkDriver@172.31.34.148:60060] disassociated! Shutting down.
 15/05/18 18:19:52 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@172.31.34.148:60060] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].

You can get the whole stderr file here:

http://172.31.34.148:8081/logPage/?appId=app-20150518181945-0026&executorId=0&logType=stderr

I searched for this problem and found this:

Why spark application fail with "executor.CoarseGrainedExecutorBackend: Driver Disassociated"?

I increased spark.yarn.executor.memoryOverhead as it suggested, but that doesn't work.

I have just one master node (8 GB memory), and in Spark's slaves file there is only one slave node: the master itself. I submit like this:

./bin/spark-submit --class .... --master spark://master:7077 --executor-memory 6G --total-executor-cores 8 /path/..jar hdfs://myfile

I don't know what the executor and the driver are... lol... sorry about that.

Can anybody help me?


Answer 1:


If the Spark driver fails, it gets disassociated (from the YARN AM). Try the following to make it more fault-tolerant:

  • spark-submit with the --supervise flag on a Spark Standalone cluster
  • yarn-cluster mode on YARN
  • the spark.yarn.driver.memoryOverhead parameter, to increase the driver's memory allocation on YARN

Note: Driver supervision (spark.driver.supervise) is not supported on a YARN cluster (yet).
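As a rough sketch, the first and third options above might look like this on the command line (the class name, jar path, and memory sizes here are placeholders, not values from the question):

```shell
# Standalone: cluster deploy mode with --supervise restarts the driver if it dies.
./bin/spark-submit --class com.example.MyApp \
  --master spark://master:7077 \
  --deploy-mode cluster \
  --supervise \
  /path/to/app.jar

# YARN: run the driver inside the cluster and give it extra off-heap headroom (in MB).
./bin/spark-submit --class com.example.MyApp \
  --master yarn --deploy-mode cluster \
  --conf spark.yarn.driver.memoryOverhead=1024 \
  /path/to/app.jar
```

Note that on Standalone, --supervise only takes effect in cluster deploy mode, where the driver runs on a worker rather than in the submitting process.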




Answer 2:


An overview of driver vs. executor (and others) can be found at http://spark.apache.org/docs/latest/cluster-overview.html or https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-architecture.html

They are Java processes that can run on the same machine or different machines, depending on your configuration. The driver contains the SparkContext and declares the RDD transformations (and, if I'm not mistaken, the execution plan), then communicates them to the Spark master, which creates task definitions and asks the cluster manager (its own, YARN, or Mesos) for resources (worker nodes); those tasks in turn get sent to executors for execution.

Executors communicate certain information back to the master, and as far as I understand, if the driver encounters a problem or crashes, the master takes note and tells the executor (which in turn logs) what you see: "driver is disassociated". This can happen for a lot of reasons, but the most common is that the Java process (the driver) runs out of memory (try increasing spark.driver.memory).
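For illustration (the class name and jar path are placeholders), the driver heap can be raised either on the command line or in the defaults file:

```shell
# On the command line:
./bin/spark-submit --class com.example.MyApp \
  --master spark://master:7077 \
  --driver-memory 4G \
  /path/to/app.jar

# Or equivalently, as a line in conf/spark-defaults.conf:
#   spark.driver.memory  4g
```

Keep in mind that on a single 8 GB machine like the one in the question, the driver's memory competes with the 6 GB already requested for executors.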

There are some differences when running on YARN vs. Standalone vs. Mesos, but I hope this helps. If the driver is disassociated, the Java process running as the driver likely encountered an error; the master logs might have something, and I'm not sure whether there are driver-specific logs. Hopefully someone more knowledgeable than me can provide more info.



Source: https://stackoverflow.com/questions/30317635/sparkexecutor-coarsegrainedexecutorbackend-driver-disassociated-disassociated
