Event de-duplication using Cassandra

Posted by 五迷三道 on 2019-12-05 16:20:57

IF NOT EXISTS does not scale as well as stock Cassandra writes (because coordination is slow, but you know that), but it is probably the "official, right" way to do it. There are two other methods that "work":

1) Use an external locking system (ZooKeeper, memcached CAS, etc.) that lets you handle the coordination OUTSIDE of Cassandra.

2) Use an ugly hack: an inverted timestamp trick so that the first write wins. Rather than using a client-supplied timestamp that corresponds to actual wall time, use MAX_LONG - (wall time) = timestamp. That way, the first write has the highest "timestamp" and will take precedence over subsequent writes. This method works, though it plays havoc with things like DTCS (if you're doing time series and want to use DTCS, don't use this method; DTCS will be horribly confused) and with deletion in general (if you ever want to ACTUALLY DELETE a row with a REAL tombstone, you'll have to write that tombstone with an artificial timestamp as well).
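To make the arithmetic in method 2 concrete, here is a minimal sketch in plain Python. It does not talk to Cassandra; `LastWriteWinsCell` and `inverted_timestamp` are hypothetical names that only model the "highest timestamp wins" rule, and the microsecond unit is an assumption. Against a real cluster, the computed value would be supplied through CQL's `USING TIMESTAMP` clause.

```python
# Largest signed 64-bit value; write timestamps are assumed to fit in this range.
MAX_LONG = 2**63 - 1


def inverted_timestamp(wall_time_micros: int) -> int:
    """Map an earlier wall-clock time to a larger write timestamp."""
    return MAX_LONG - wall_time_micros


class LastWriteWinsCell:
    """Toy model of a single cell: the write with the highest timestamp wins."""

    def __init__(self):
        self.value = None
        self.timestamp = None

    def write(self, value, timestamp):
        # Keep the incoming write only if its timestamp is strictly higher.
        if self.timestamp is None or timestamp > self.timestamp:
            self.value = value
            self.timestamp = timestamp


cell = LastWriteWinsCell()
cell.write("first", inverted_timestamp(1_000_000))      # earlier wall time
cell.write("duplicate", inverted_timestamp(2_000_000))  # later wall time, loses
```

Because the earlier wall time maps to the larger timestamp, the first event survives even if the duplicate reaches the replica before it. The toy model ignores how ties between equal timestamps are resolved.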

It's worth noting that there have been attempts to address the 'last-write-always-wins' nature of Cassandra, for example CASSANDRA-6412 (which I had working at one point, and will likely pick up again in the next month or so).

This might be a diversion, but have you tried distributed Redis locks (http://redis.io/topics/distlock), with sharding based on event_id and Twemproxy as a proxy for Redis if your load is too high?

I think that, of all the proposed solutions, your second one is the best. But instead of storing only the oldest value per clustering column, I would store all events and keep the history ordered from oldest to newest (when inserting, you don't have to check whether a row already exists and is the oldest; you can later select the one with the oldest writetime attribute). Then I would select the oldest for processing, as you wrote. Since Cassandra sees no difference between an insert and an upsert, I don't see any alternative way to do it inside Cassandra; otherwise, as someone said, do this outside.
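The "store everything, pick the oldest" idea can be sketched roughly like this in plain Python. `EventLog` is a hypothetical in-memory stand-in for a table partitioned by event_id with the write time as a clustering column; in Cassandra the read would presumably be an ascending-order query limited to one row, which is an assumption about the schema rather than something stated in the question.

```python
from collections import defaultdict


class EventLog:
    """In-memory stand-in: every copy of an event is kept, none is overwritten."""

    def __init__(self):
        # event_id -> list of (write_time, payload) tuples
        self._rows = defaultdict(list)

    def insert(self, event_id, write_time, payload):
        # Blind append: no existence check and no coordination on the write path.
        self._rows[event_id].append((write_time, payload))

    def oldest(self, event_id):
        # The copy with the smallest write time is the one to process.
        rows = self._rows.get(event_id)
        if not rows:
            return None
        return min(rows, key=lambda row: row[0])


log = EventLog()
log.insert("evt-1", 205, "duplicate copy")
log.insert("evt-1", 100, "original copy")
log.insert("evt-2", 150, "another event")
winner = log.oldest("evt-1")
```

The write path stays cheap because de-duplication is deferred to read time; the cost is the extra storage for the duplicate rows.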
