ndb Models are not saved in memcache when using MapReduce

Posted by 蓝咒 on 2019-12-06 12:39:14
Scotty

MapReduce intentionally disables memcache for NDB.

See mapreduce/util.py, line 373, _set_ndb_cache_policy() (as of 2015-05-01):

def _set_ndb_cache_policy():
  """Tell NDB to never cache anything in memcache or in-process.

  This ensures that entities fetched from Datastore input_readers via NDB
  will not bloat up the request memory size and Datastore Puts will avoid
  doing calls to memcache. Without this you get soft memory limit exits,
  which hurts overall throughput.
  """
  ndb_ctx = ndb.get_context()
  ndb_ctx.set_cache_policy(lambda key: False)
  ndb_ctx.set_memcache_policy(lambda key: False)

You can force get_by_id() and put() to use memcache by passing use_memcache=True, e.g.:

product = Product.get_by_id(p_id, use_memcache=True)
...
product.put(use_memcache=True)

Alternatively, you can modify the NDB context if you are batching puts together with mapreduce.operation. However, I don't know enough to say whether this has other undesired effects:

ndb_ctx = ndb.get_context()
ndb_ctx.set_memcache_policy(lambda key: True)
...
yield operation.db.Put(product)

As for the docstring about "soft memory limit exits", I don't understand why that would occur if only memcache were enabled (i.e. with no in-context cache).

It actually seems like you want memcache to be enabled for puts, otherwise your app ends up reading stale data from NDB's memcache after your mapper has modified the data underneath.
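The stale-read concern above can be illustrated with a toy model. This is only a sketch: plain dictionaries stand in for Datastore and memcache, the put() and get() helpers are hypothetical and not NDB's API, and it assumes that a put with memcache enabled invalidates the cached entry.

```python
# Toy model only: plain dicts stand in for Datastore and memcache.
datastore = {}
memcache = {}


def put(key, value, use_memcache):
    """Write to the datastore; optionally invalidate the memcache entry."""
    datastore[key] = value
    if use_memcache:
        # Assumed behaviour: a memcache-aware put drops the cached copy.
        memcache.pop(key, None)


def get(key):
    """Read through memcache, falling back to the datastore on a miss."""
    if key not in memcache:
        memcache[key] = datastore[key]
    return memcache[key]


put("product-1", "old price", use_memcache=True)
first_read = get("product-1")   # the app now has "old price" cached

# A mapper writes with memcache disabled: the cached copy is left untouched.
put("product-1", "new price", use_memcache=False)
stale_read = get("product-1")   # served from memcache, not the datastore

# The same write with memcache enabled invalidates the cached copy.
put("product-1", "newer price", use_memcache=True)
fresh_read = get("product-1")   # cache miss, so the datastore is re-read
```

In this model the second read still returns the old value because nothing told memcache that the entity changed, which is the situation described above for the rest of the app.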

As Slawek Rewaj already mentioned, this is caused by the in-context cache. When retrieving an entity, NDB tries the in-context cache first, then memcache, and finally fetches the entity from Datastore if it was found in neither the in-context cache nor memcache. The in-context cache is just a Python dictionary, and its lifetime and visibility are limited to the current request, but MapReduce makes multiple calls to product_bulk_import_map() within a single request.
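The lookup order described above can be sketched with a small simulation. ToyContext and its attributes are hypothetical names used for illustration only; this is not NDB's implementation, just a model of a per-request dictionary sitting in front of shared memcache and Datastore stand-ins.

```python
class ToyContext(object):
    """Hypothetical stand-in for one request's NDB context."""

    def __init__(self, memcache, datastore):
        self.in_context = {}        # lives only as long as this request
        self.memcache = memcache    # shared across requests
        self.datastore = datastore  # shared across requests

    def get(self, key):
        """Return (value, source), trying the caches in order."""
        if key in self.in_context:
            return self.in_context[key], "in-context"
        if key in self.memcache:
            value, source = self.memcache[key], "memcache"
        else:
            value, source = self.datastore[key], "datastore"
            self.memcache[key] = value
        self.in_context[key] = value
        return value, source


shared_memcache = {}
shared_datastore = {"product-1": "v1"}

# One request that looks up the same entity more than once.
request = ToyContext(shared_memcache, shared_datastore)
first = request.get("product-1")

# Something else updates the entity and clears memcache mid-request.
shared_datastore["product-1"] = "v2"
shared_memcache.clear()

second = request.get("product-1")  # same request: the dict still holds v1
third = ToyContext(shared_memcache, shared_datastore).get("product-1")
```

Within the same request the second lookup never reaches memcache or the datastore, while a fresh context sees the updated value. That matches the point about several mapper calls sharing one request.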

You can find more information about the in-context cache here: https://cloud.google.com/appengine/docs/python/ndb/cache#incontext
