What to do if Cassandra reports failure but did a partial write?

Posted by 别来无恙 on 2021-02-19 05:08:08

Question


Cassandra does not guarantee atomic behavior across replicas, so there is a slight chance that one replica fails to apply a write while the other replicas persist the change.

Is there any information on how to defend against this, and on what to do to heal it if it happens? Does Cassandra heal itself in that regard?

[Update]

I am specifically interested in the case where you send a write request to, let's say, all replicas and only one replica fails with a write error. The node that fails the write is still alive and operational. According to the Cassandra documentation, the write request will return a failure even if the other two replicas (with a replication factor of 3) succeeded.

According to the documentation, in this case two replicas have changed and one keeps the original value. It states that this is an inconsistent state, since the other two replicas cannot roll back the change they have already written.

So the question is: how can one defend against that?


Answer 1:


In Cassandra, a timeout such as this is not considered a failure. See this blog post describing how Cassandra handles different conditions when it comes to writes:

Remember that for writes, a timeout is not a failure.

How can we say that since we don’t know what happened before the replica failed? The coordinator can force the results towards either the pre-update or post-update state. This is what Cassandra does with hinted handoff.

...the coordinator stores the update locally, and will re-send it to the failed replica when it recovers, thus forcing it to the post-update state that the client wanted originally.
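To make the quoted behaviour concrete, here is a minimal sketch of hinted handoff as a toy in-memory model. It is not Cassandra's implementation: the `Replica` and `Coordinator` classes, the `accepting_writes` flag and the `required_acks` parameter are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica: a key/value store that can be made to reject writes."""
    name: str
    accepting_writes: bool = True
    data: dict = field(default_factory=dict)

    def write(self, key, value):
        if not self.accepting_writes:
            return False
        self.data[key] = value
        return True


@dataclass
class Coordinator:
    """Toy coordinator that keeps a hint for every replica that missed a write."""
    replicas: list
    hints: list = field(default_factory=list)  # (replica, key, value) tuples

    def write(self, key, value, required_acks):
        acks = 0
        for replica in self.replicas:
            if replica.write(key, value):
                acks += 1
            else:
                # Store the update locally so it can be re-sent later.
                self.hints.append((replica, key, value))
        # The client sees an error if too few replicas acknowledged,
        # even though the others have already persisted the value.
        return acks >= required_acks

    def replay_hints(self):
        # Re-send stored updates; keep the ones that still cannot be delivered.
        remaining = []
        for replica, key, value in self.hints:
            if not replica.write(key, value):
                remaining.append((replica, key, value))
        self.hints = remaining


replicas = [Replica("a"), Replica("b"), Replica("c", accepting_writes=False)]
coordinator = Coordinator(replicas)

# Require all three acknowledgements, similar in spirit to consistency level ALL.
ok = coordinator.write("user:1", "new", required_acks=3)
print(ok)                           # the client is told the write did not succeed
print([r.data for r in replicas])   # yet two replicas already hold the new value

replicas[2].accepting_writes = True  # the third replica recovers
coordinator.replay_hints()
print([r.data for r in replicas])    # the hint pushes it to the post-update state
```

The point of the sketch is that nothing is rolled back: the two successful replicas keep the new value, and the stored hint moves the third one forward to match them.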

So, to answer your question: yes, Cassandra will heal itself using hinted handoff, and when that process fails (i.e. max_hint_window_in_ms is exceeded before the replica comes back online), a repair should get things into a consistent state. This is one reason why it is recommended to run repairs regularly.
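The fallback from hints to repair can be sketched the same way. The snippet below is only a toy model of the outcome, assuming last-write-wins reconciliation by write timestamp. A real repair (typically started with `nodetool repair`) compares data between replicas and streams the differences, which is far more involved. The function names, the three-hour window and the data layout are assumptions made for this example.

```python
# Assumed hint window for this sketch: 3 hours, expressed in milliseconds.
MAX_HINT_WINDOW_MS = 3 * 60 * 60 * 1000


def within_hint_window(down_since_ms, now_ms, window_ms=MAX_HINT_WINDOW_MS):
    """Return True while the replica has been down no longer than the hint window."""
    return now_ms - down_since_ms <= window_ms


def repair(replicas):
    """Bring every replica to the newest (timestamp, value) seen for each key."""
    newest = {}
    for data in replicas:
        for key, (timestamp, value) in data.items():
            if key not in newest or timestamp > newest[key][0]:
                newest[key] = (timestamp, value)
    for data in replicas:
        data.update(newest)
    return newest


# Each replica maps key -> (write timestamp, value). Replica c missed the update.
a = {"user:1": (200, "new")}
b = {"user:1": (200, "new")}
c = {"user:1": (100, "old")}

down_since_ms = 0
now_ms = 4 * 60 * 60 * 1000  # c has been unreachable for 4 hours

if not within_hint_window(down_since_ms, now_ms):
    # Hints no longer cover this outage, so only a repair can reconcile c.
    repair([a, b, c])

print(c)  # c now carries the same timestamp and value as a and b
```

Because the newest timestamp wins, the repair converges on the post-update state, the same direction hinted handoff would have pushed the lagging replica.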

This article explains hinted handoff in more detail.



Source: https://stackoverflow.com/questions/30017301/what-to-do-if-cassandra-reports-failure-but-did-a-partial-write
