Avoiding split-brain, votes and quorum [closed]

落花浮王杯 提交于 2019-12-29 03:14:04

问题


Suppose you have n processes, n > 2. You want to have agreement amongst them that one is to be active. So they need to vote amonst each other to determine which one is active.

All processes may fail at any time, we want to have one process active if possible, but ...

We must never have two active at the same time, so if they can't be sure it is better to have no-one active. (Ie. we want to avoid split brain)

The only available communication mechanism between them is pub-sub messaging (not point to point).

One or more databases are available, but no one database should be a single point of failure. Ie. it would be very undesirabloe if all the processes were available to work, and the were prevented from doing so by the loss of a single database.

Design? What messages need to be published?


回答1:


Theory:

This is leader election, which is a form of the Consensus Problem, also sometimes called The Two Generals Problem. Under some sets of assumptions (fully async and messages can be lost) it's been proven impossible, and the proof is particularly elegant.

The intuition of this problem is: imagine some algorithm exists that allows consensus to be reached in some fixed number of messages. Since failures are tolerated, we can drop one message from the protocol, and it should still work. We can repeat this process until there are no messages at all, clearly an impossibility.

In practice we overcome this using failure detectors to simulate a synchronous system.

The most widely known algorithm that solves consensus is Paxos, which can tolerate failure of up to half of the participating nodes. Paxos has the reputation of being very difficult to implement as even slight misunderstandings of the details of the protocol destroy it's correctness.

Practical solutions:

While the problem in general is quite difficult, getting working systems up is far easier. There are off the shelf implementations of Paxos or equivalent algorithms available. Apache Zookeeper is the best I'm aware of. For your specific problem, I'm pretty sure it'll be your quickest route. Other Paxos implementations are around, and it also might be possible to build something on network redundancy virtual ip tools like Wackamole. I believe the high end versions of most commercial databases offer quorum features as an (expensive) option.

Also, for many applications it's acceptable to weaken correctness slightly or otherwise adjust the problem to allow much simpler solutions.

For example, if a single point of failure is tolerable because recovery is likely to be quick, then the problem is trivial: just have one special node do the work.

Another approach might be to build the system around idempotent actions, so duplicate processing becomes tolerable.

Lastly you might partition the workload into a pool of non-redundant systems: here failures will delay processing until recovery but only for items at that node, not for the entire workload.

These sorts of compromises are so much simpler that they're often a better choice. One has to weigh the utility of a full solution against the complexity of implementing it and see if there's really value. This is why so many practical systems just use 2 Phase or 3 Phase Commit, even though they block in some scenarios: the decreased availability is tolerable compared to the complexity of a full quorum system.




回答2:


I'm not clear on pub-sub messaging.

If they're getting some sort of work objects from an outside source and you only want one of them to process the work, you could take a hash value space, 2^64, divide up the space by the number of nodes each node taking a chunk. Each node could hash the work objects as they come in and determine if it's theirs.




回答3:


Take a look at how routing protocols (OSPF and IS-IS) do it, and see if that works for you. They elect a leader (and in the OSPF case, a backup leader).



来源:https://stackoverflow.com/questions/1106984/avoiding-split-brain-votes-and-quorum

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!