Are ndb cached read ops still counted as datastore read ops for billing purposes?


The memcache hit rate includes the hits for entities written by ndb. It does not include the number of in-context cache hits by ndb.

Each write to the datastore invalidates the caches, so the next read will not hit memcache or the in-context cache.

One other thing is that ndb creates a new in-context cache for each transaction, so the in-context cache isn't very effective in the face of transactions.
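For reference, ndb also exposes per-kind and per-call switches for both of these cache layers. Here is a minimal sketch using the standard GAE Python ndb API; the model name and key id are made up, only the cache options themselves are real:

from google.appengine.ext import ndb

class SequenceConfig(ndb.Model):
    # Hypothetical model, shown only to illustrate ndb's cache knobs:
    # _use_cache controls the per-request in-context cache,
    # _use_memcache controls the shared memcache layer.
    _use_cache = True
    _use_memcache = True
    name = ndb.StringProperty()

# The same options can be passed per call; this read bypasses both ndb
# caches, so it always counts as a billed datastore read operation.
config = ndb.Key(SequenceConfig, 'default').get(use_cache=False,
                                                use_memcache=False)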

The quick answer is that a memcache hit for a datastore entity is not charged as a datastore read for billing purposes.

Following Jim's answer, I did further digging; I'm sharing my findings here as others may find them useful.

I congratulated myself for deciding from day one to build and use a generic wrapper class for all my datastore entities, further inherited by specific classes for each entity kind, instead of operating directly with model classes inheriting from ndb.Model.

Since I already had a db_data @property in this generic class for reading the entity from the datastore on demand, it was easy to plug in my own memcache-based entity caching scheme, plus tracking code to determine whether reads come from memcache or from ndb and whether they happen inside transactions. This is what it looks like:

@property
def db_data(self):
    if not hasattr(self, self.attr_db_data):
        db_data = None
        if self.readonly or not ndb.in_transaction():
            # noinspection PyTypeChecker
            db_data = memcache.get(self.memcache_key)
        if db_data:
            if not isinstance(db_data, self.db_model):
                logging.error('mismatched cached db_data kind, expected %s got type %s %s' %
                              (self.kind, type(db_data), db_data))
                db_data = None
            else:
                if self.trace:
                    logging.debug('%s from memcache' % self.lid)
                self.cached = True
        if not db_data:
            if hasattr(self, self.attr_db_key):
                db_data = self.db_key.get()
            elif hasattr(self, self.attr_key_id):
                db_data = self.db_model.get_by_id(id=self.key_id)
            else:
                raise RuntimeError('missing db_data, db_key and key_id')
            if db_data:
                # use ndb model names as strings in the list below. TODO: don't leave them on!
                show_trace = self.kind in ['']
                if self.trace or show_trace:
                    if show_trace:
                        self.print_stack(4)
                    logging.debug('%s from datastore' % self.lid)
                self.cached = False
                # noinspection PyTypeChecker
                memcache.set(self.memcache_key, db_data)
        setattr(self, self.attr_db_data, db_data)
    return getattr(self, self.attr_db_data)

With this in place I uncovered:

  • a lot of unnecessary reads from ndb, most notably of the sequence configuration entities, which did not change at all during the entire sequence yet were accessed hundreds of times
  • some reads for the same entities repeated inside the same request
  • most importantly, the details behind my many transactions failing with TransactionFailedError(The transaction could not be committed. Please try again.) or TransactionFailedError: too much contention on these datastore entities. please try again, caused by all these unnecessary reads from many transactions executed in parallel. I knew (or rather suspected) the root cause - see the related Contention problems in Google App Engine - but with these details I can now actually work towards reducing the failures.

For the write side, I added a db_data_put() method to the generic class (with many checks and tracking support) and replaced all the .db_data.put() calls scattered across the app code with it. This is what it looks like:

def db_data_put(self, force=False):
    #assert ndb.in_transaction()  TODO: re-enable after clearing all violations

    self.db_key = self.db_data.put()

    show_trace = False
    if not ndb.in_transaction():
        logging.error('%s: db_data_put called outside transaction, readonly=%s' % (self.lid, self.readonly))
        show_trace = True
    if self.readonly:
        logging.error('%s: db_data_put called with readonly set' % self.lid)
        show_trace = True
    if force:
        if self.trace:
            logging.warn('%s: db_data_put called with force arg set' % self.lid)
    if self.cached:
        logging.error('%s: db_data_put called with cached data' % self.lid)
        show_trace = True
    if self.put_count:
        logging.warn('%s: db_data_put already called %d time(s)' % (self.lid, self.put_count))
        show_trace = True
    self.put_count += 1

    if self.update_needed:
        self.update_needed = False
    elif not force:
        if self.trace:
            logging.warn('%s: db_data_put called without update_needed set' % self.lid)
            show_trace = True

    # noinspection PyTypeChecker
    memcache.set(self.memcache_key, self.db_data)

    # use ndb model names as strings in the list below. TODO: don't leave them on!
    show_kind = self.kind in ['']

    if show_trace or show_kind:
        self.print_stack(4)

    if self.trace or show_trace or show_kind:
        logging.debug('%s: db_data_put %s' % (self.lid, self.kind))

With it I uncovered some room for improvement and some lurking bugs:

  • entities written outside transactions (room for data corruption)
  • entities written multiple times in response to the same request (thus violating the max 1 write/second/entity group datastore limit)
  • several hot requests occasionally clustering very close to each other, thus contributing to the contention problem and, when writing the same entities, also violating the 1 write/sec limit.

After setting the self.readonly flag on all sequence config objects and on a couple of other frequently (and unnecessarily) read ones (to enable caching even inside transactions), serializing the hot writing requests (see the sketch after the list below), and fixing the most critical bugs found, I re-ran the clean measurement test:

  • the entire sequence execution time dropped to ~13 minutes, probably from serializing the hot transactions, which helped reduce the "noise" of the sequence and the overall number of logic state transitions and push queue tasks
  • the number of data contention-related transaction failures dropped by ~60% - I attribute most of this to my private caching of entities not expected to change during a transaction (ndb has no clue whether an entity accessed in a transaction will eventually be written or not) and to the serialization of the hot transactions
  • the Cloud Datastore Entity Writes dropped to 0.0055 Million Ops - probably due to the fewer sequence state transitions mentioned above
  • the Cloud Datastore Read Operations dropped to 0.01 Million Ops - my caching helps during transactions
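
The sketch mentioned above: one generic way to serialize hot write requests is to route them through a dedicated push queue that executes a single task at a time. This is not necessarily how it was done here; the queue name, handler URL and payload below are made up, only the taskqueue API and the max_concurrent_requests queue setting are real:

from google.appengine.api import taskqueue

# queue.yaml would define the queue along the lines of:
#   queue:
#   - name: hot-writes
#     rate: 5/s
#     max_concurrent_requests: 1   # one task at a time => writes are serialized
taskqueue.add(queue_name='hot-writes',
              url='/tasks/hot_write',          # hypothetical handler
              params={'entity_id': 'abc123'})  # hypothetical payload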

The ratio of writes to reads actually increased, invalidating my assumption that they should be comparable - the fact that most of the sequence activity happens inside transactions matters in this case. I guess my assumption would only hold in a non-transactional context.
