What happens when Spark master fails?

情歌与酒 2020-12-15 23:26

Does the driver need constant access to the master node, or is it only required to get the initial resource allocation? What happens if the master is not available after Spark cont

3 Answers
  •  长情又很酷
    2020-12-15 23:59

    The first, and for now probably the most serious, consequence of a master failure or a network partition is that your cluster won't be able to accept new applications. This is why the Master is considered a single point of failure when the cluster runs with the default configuration.
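As a rough illustration of moving away from the default configuration, the sketch below assembles the standalone-mode recovery properties (`spark.deploy.recoveryMode`, `spark.deploy.zookeeper.url`, `spark.deploy.zookeeper.dir`) into a value for `SPARK_DAEMON_JAVA_OPTS`, plus a master URL that lists several masters. The helper names and all host names are hypothetical and chosen only for this example; check the property values against the documentation for your Spark version.

```python
def standby_master_opts(zk_hosts, zk_dir="/spark"):
    """Build JVM options that switch the standalone Master to ZooKeeper recovery."""
    props = {
        "spark.deploy.recoveryMode": "ZOOKEEPER",
        "spark.deploy.zookeeper.url": ",".join(zk_hosts),
        "spark.deploy.zookeeper.dir": zk_dir,
    }
    return " ".join(f"-D{key}={value}" for key, value in props.items())


def master_url(masters):
    """Build a master URL that lists every master, so clients can find the leader."""
    return "spark://" + ",".join(masters)


# Hypothetical hosts, for illustration only.
opts = standby_master_opts(["zk1:2181", "zk2:2181"])
url = master_url(["master1:7077", "master2:7077"])

print(f'SPARK_DAEMON_JAVA_OPTS="{opts}"')
print(url)
```

The first printed line is meant for `conf/spark-env.sh` on each master; the second is the kind of URL an application would pass as its master address.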

    The loss of the Master will be noticed by running applications, but otherwise they should continue to work more or less as if nothing had happened, with two important exceptions:

    • The application won't be able to finish gracefully.
    • If the master is down, or a network partition affects the worker nodes as well, the workers will try to reregisterWithMaster. If this fails multiple times, the workers will simply give up. At that point long-running applications (such as streaming apps) won't be able to continue processing, but this still shouldn't result in immediate failure. Instead, the application will wait for the master to come back online (file system recovery) or for contact from a new leader (ZooKeeper mode), and if that happens it will continue processing.
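The "retry, then give up" behaviour described above can be pictured with a small simulation. This is not Spark's code: `reregister_with_master` here is a hypothetical Python stand-in, the attempt limit of 5 is an arbitrary assumption, and a real worker also waits between attempts, which the sketch leaves out.

```python
def reregister_with_master(try_register, max_attempts):
    """Call try_register() until it succeeds or max_attempts is exhausted.

    Returns the 1-based number of the attempt that succeeded, or None if
    the worker gave up.
    """
    for attempt in range(1, max_attempts + 1):
        if try_register():
            return attempt
    return None


# Simulated master that becomes reachable again on the third attempt.
responses = iter([False, False, True])
recovered = reregister_with_master(lambda: next(responses), max_attempts=5)

# Simulated master that never comes back: the worker gives up.
gave_up = reregister_with_master(lambda: False, max_attempts=5)

print(recovered, gave_up)
```

In the first case the worker rejoins the cluster and processing can resume; in the second it stops trying, which is the state in which long-running applications stall until a master returns.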
