I'm trying to reduce the execution time of an AppEngine query by running multiple sub-queries asynchronously, using query.fetch_async(). However, it seems that the gain is
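Are you always running run_parallel before run_serial? If so, ndb may be caching the results from the first run, which would let the second run pull the information much faster. Try flipping the order of the two calls, or even better, try the same test with the older db API, since ndb adds automatic caching (in-context and memcache) on top of the datastore.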
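
To illustrate why the order matters, here is a minimal sketch using only the standard library. It is not your actual code: `fake_fetch`, `_cache`, `timed`, and `measure` are hypothetical names, and the dictionary only stands in for whatever caching happens between your two runs. The `run_parallel` and `run_serial` bodies are identical on purpose, so any timing difference comes from the order alone.

```python
import time

_cache = {}  # hypothetical stand-in for a cache that warms up on first use


def fake_fetch(key):
    """Simulate a slow lookup that becomes fast once its result is cached."""
    if key not in _cache:
        time.sleep(0.02)  # pretend this is the datastore round trip
        _cache[key] = key * 2
    return _cache[key]


def run_parallel():
    # Placeholder body: deliberately the same work as run_serial.
    return [fake_fetch(k) for k in range(3)]


def run_serial():
    return [fake_fetch(k) for k in range(3)]


def timed(fn):
    """Return the wall-clock seconds taken by one call to fn."""
    start = time.time()
    fn()
    return time.time() - start


def measure(order):
    """Start from a cold cache, then time each function in the given order."""
    _cache.clear()
    return [(fn.__name__, timed(fn)) for fn in order]


parallel_first = measure([run_parallel, run_serial])
serial_first = measure([run_serial, run_parallel])

for label, result in (("parallel first", parallel_first),
                      ("serial first", serial_first)):
    print(label, ["%s: %.3fs" % pair for pair in result])
```

In this sketch whichever function runs first pays the cold-cache cost, regardless of which one it is. If your real benchmark shows the same pattern after you flip the order, the difference you measured is probably a caching effect rather than a gain from fetch_async(). With ndb you can also try turning caching off for the test via ndb.get_context().set_cache_policy(False) and set_memcache_policy(False), then compare again.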