High-concurrency counters without sharding

This question concerns two implementations of counters which are intended to scale without sharding (with a tradeoff that they might under-count in some situations):

http://appengine-cookbook.appspot.com/recipe/high-concurrency-counters-without-sharding/ (the code in the comments)
http://blog.notdot.net/2010/04/High-concurrency-counters-without-sharding

My questions:

With respect to #1: Running memcache.decr() in a deferred, transactional task seems like overkill. If memcache.decr() is done outside the transaction, I think the worst-case is the transaction fails and we miss counting whatever we decremented. Am I overlooking some other problem that could occur by doing this?
What are the significiant tradeoffs between the two implementations?

Here are the tradeoffs I see:

2 does not require datastore transactions.
To get the counter's value, #2 requires a datastore fetch while with #1 typically only needs to do a memcache.get() and memcache.add().
When incrementing a counter, both call memcache.incr(). Periodically, #2 adds a task to the task queue while #1 transactionally performs a datastore get and put. #1 also always performs memcache.add() (to test whether it is time to persist the counter to the datastore).

Conclusions

(without actually running any performance tests):

1 should typically be faster at retrieving a counter (#1 memcache vs #2 datastore). Though #1 has to perform an extra memcache.add() too.
However, #2 should be faster when updating counters (#1 datastore get+put vs #2 enqueue a task).
On the other hand, with #1 you have to be a bit more careful with the update interval since the task queue quota is almost 100x smaller than either the datastore or memcahce APIs.

Going to datastore is likely to be more expensive than going through memcache. Else memcache wouldn't be all that useful in the first place :-)

I'd recommend the first option.

If you have a reasonable request rate, you can actually implement it even simpler:

1) update the value in memcache
2) if the returned updated value is evenly divisible by N
2.1) add N to the datastore counter
2.2) decrement memcache by N

This assumes you can set a long enough timeout on your memcache to live between successive events, but if events are so sparse that your memcache times out, chances are you wouldn't need a "high concurrency" counter :-)

For larger sites, relying on a single memcache to do things like count total page hits may get you in trouble; in that case, you really do want to shard your memcaches, and update a random counter instance; the aggregation of counters will happen by the database update.

When using memcache, though, beware that some client APIs will assume that a one second timeout means the value isn't there. If the TCP SYN packet to the memcache instance gets dropped, this means that your request will erroneously assume the data isn't there. (Similar problems can happen with UDP for memcache)

Memcache gets flushed, you lose your counter. OUCH. Using a mysql database or a NOSQL solution will resolve that problem with a possible performance hit. (Redis, Tokyotyrant, MongoDB etc...) may not have that performance hit.

Keep in mind, you may want to do 2 actions:

keep a memcache counter just for the high performance reasons.
keep a log, and then get more accurate metrics from that.

来源：https://stackoverflow.com/questions/2769934/high-concurrency-counters-without-sharding

标签

python

google-app-engine

counter

High-concurrency counters without sharding

My questions:

Here are the tradeoffs I see:

2 does not require datastore transactions.

Conclusions

1 should typically be faster at retrieving a counter (#1 memcache vs #2 datastore). Though #1 has to perform an extra memcache.add() too.

1 should typically be faster at retrieving a counter (#1 memcache vs #2 datastore). Though #1 has to perform an extra `memcache.add()` too.