CAP theorem - Availability and Partition Tolerance

后端 未结 9 1727
囚心锁ツ
囚心锁ツ 2020-12-04 04:09

While I try to understand the \"Availability\" (A) and \"Partition tolerance\" (P) in CAP, I found it difficult to understand the explanations from various articles.

<
相关标签:
9条回答
  • 2020-12-04 04:53

    In simple CAP theorem states that its impossible for a distributed system to simultaneously provide all three guarantees:

    Consistency

    Every node contains same data at the same time

    Availability

    At least one node must be available to serve data every time

    Partition tolerance

    Failure of the system is very rare

    Mostly every system can only guarantee minimum two features either CA, AP, or CP.

    0 讨论(0)
  • 2020-12-04 05:00

    Consistency – When we are sending the read request, if it is returning result, it should return the most recent write given by client request. Availability – Your request for read/write should always succeed. Partition tolerance – When there is network partition (problem for some machines to talk with each other) occurs, system should still work.

    In a distributed there are chances that network partition will occur and we cannot avoid “P” of CAP. So we choose between “Consistency” and “Availability”.

    http://bigdatadose.com/understanding-cap-theorem/

    0 讨论(0)
  • 2020-12-04 05:02

    Considering P in equal terms with C and A is a bit of a mistake, rather '2 out of 3' notion among C,A,P is misleading. The succinct way I would explain CAP theorem is, "In a distributed data store, at the time of network partition you have to chose either Consistency or Availability and cannot get both". Newer NoSQL systems are trying to focus on Availability while traditional ACID databases had a higher focus on Consistency.

    You really cannot choose CA, network partition is not something anyone would like to have, it is just an undesirable reality of a distributed system, networks can fail. Question is what trade off do you pick for your application when that happens. This article from the man who first formulated that term seems to explain this very clearly.

    0 讨论(0)
  • 2020-12-04 05:02

    Consistency:

    A read is guaranteed to return the most recent write(like ACID) for a given client. If any request comes during that time it has to wait till data sync completed across/in the node(s).


    Availability:

    every node (if not failed) always executes queries and should always respond to requests. It does not matter whether it returns the latest copy or not.


    Partition-tolerance:

    The system will continue to function when network partitions occur.


    Regarding AP, Availability(always accessible) can exist with(Cassendra) or without(RDBMS) partition tolerance

    pic source

    0 讨论(0)
  • 2020-12-04 05:02

    Simple way to understand CAP theorem:

    In case of network partition, one needs to choose between perfect availability and perfect consistency.

    Picking consistency means not being able to answer a client's query as the system cannot guarantee to return the most recent write. This sacrifices availability.

    Picking availability means being able to respond to a client's request but the system cannot guarantee consistency, i.e., the most recent value written. Available systems provide the best possible answer under the given circumstance.

    This explanation is from this excellent article. Hope it will help.

    0 讨论(0)
  • 2020-12-04 05:03

    I feel partition tolerance is not explained well in any of the answers so just to explain things in some more detail CAP theorem means:

    C: (Linearizability or strong consistency) roughly means

    If operation B started after operation A successfully completed, then operation B must see the system in the same state as it was on completion of operation A, or a newer state (but never old state).

    A:

    “every request received by a non-failing [database] node in the system must result in a [non-error] response”. It’s not sufficient for some node to be able to handle the request: any non-failing node needs to be able to handle it. Many so-called “highly available” (i.e. low downtime) systems actually do not meet this definition of availability.

    P:

    Partition Tolerance (terribly misnamed) basically means that you’re communicating over an asynchronous network that may delay or drop messages. The internet and all our data centres have this property, so you don’t really have any choice in this matter.

    Source: Awesome Martin kleppmann's work

    Just to take some example: Cassandra can at max be AP system. But if you configure it to read or write based on Quorum then it does not remain CAP-available (available as per definition of the CAP theorem) and is only P system.

    0 讨论(0)
提交回复
热议问题