Question
I have to add one new property to my existing NDB class:
class AppList(ndb.Model):
    ...
    ignore = ndb.BooleanProperty(default=False)  # new property
Then I will use it like below:
entries = AppList.query()
entries = entries.filter(AppList.counter > 5)
entries = entries.filter(AppList.ignore == False)
entries = entries.fetch()
I cannot use AppList.ignore != True to catch records added earlier (which don't have the ignore property), so I have to assign False to all records of my AppList entity. What is the most efficient way to do it? Currently this entity contains about 4,000 entries.
Update: I've decided to use the following ugly code (I didn't manage to apply cursors); it runs as a cron job. But won't I update the same 100 records each time?
entities = AppList.query()
# entities = entities.filter(AppList.ignore != False)
entities = entities.fetch(100)
while entities:
    for entity in entities:
        entity.ignore = False
        entity.put()
    entities = AppList.query()
    # entities = entities.filter(AppList.ignore != False)
    entities = entities.fetch(100)
Answer 1:
Don't forget that there is a MapReduce library that is used in these cases. But I think the best method is to use all these suggestions together.
Now, you need to get() and put() 4000 entities, and the question is how to reduce the "costs" of this operation.
I'm just curious to know what your bool(entity.ignore) returns. If a missing property returns False, you can adjust the code to treat it as False and postpone the operation. If you put() an entity for some other reason, the property ignore is written as False thanks to the default argument. So, for the rest of the entities, you can run a script like this (via remote_api):
def iter_entities(cursor=None):
    entries = AppList.query()
    res, cur, more = entries.fetch_page(100, start_cursor=cursor)
    # Caveat: hasattr() is True for any property declared on the model class,
    # so this filter may select nothing; in that case use put_queue = res.
    put_queue = [ent for ent in res if not hasattr(ent, 'ignore')]
    # put_queue = []
    # for ent in res:
    #     if not hasattr(ent, 'ignore'):
    #         put_queue.append(ent)
    ndb.put_multi(put_queue)
    if more:
        iter_entities(cur)  # a taskqueue is better
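The recursive helper above can also be written as a plain loop, which avoids deep recursion when there are many pages. This is only a minimal sketch and has not been run against a live datastore: backfill_pages and its two callable parameters are hypothetical names introduced here so that the loop can be exercised without App Engine. With NDB one would presumably pass lambda n, cur: AppList.query().fetch_page(n, start_cursor=cur) and ndb.put_multi.

```python
def backfill_pages(fetch_page, put_multi, page_size=100):
    """Re-put every entity, one page at a time, following the query cursor.

    fetch_page(page_size, cursor) must return (results, next_cursor, more),
    the same shape as NDB's Query.fetch_page(). put_multi takes a list.
    Both parameters are stand-ins (assumptions), not part of NDB.
    """
    cursor = None  # None means "start from the beginning"
    written = 0
    while True:
        results, cursor, more = fetch_page(page_size, cursor)
        if results:
            for entity in results:
                # Explicit assignment, although the declared default is False too.
                entity.ignore = False
            put_multi(results)  # one batched write per page
            written += len(results)
        if not more or cursor is None:
            return written
```

Because the cursor returned by one page is fed into the next call, each entity is visited once, unlike a loop that re-runs the same query from the start.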
Answer 2:
Your updated code will only ever update the first 100 entities, because every query starts again from the beginning. Try using a cursor:
https://developers.google.com/appengine/docs/python/ndb/queries#cursors
If you can't use a cursor, then use an offset and increase it by 100 on every loop, or fetch all the entries at once with fetch() (the cursor approach is the better one).
Also, instead of putting them one by one, use ndb.put_multi(list of entities to put). This will be faster than putting them one by one.
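The offset fallback mentioned above could look roughly like the sketch below. It assumes an unfiltered query with a stable order, as in the question's code, so that stepping the offset never skips or repeats records. backfill_with_offset and its callable parameters are hypothetical names, not NDB API; with NDB one would presumably pass lambda limit, offset: AppList.query().fetch(limit, offset=offset) and ndb.put_multi.

```python
def backfill_with_offset(fetch, put_multi, batch_size=100):
    """Walk the whole kind in batches by advancing an offset.

    fetch(limit, offset) must return a list of entities (empty when done).
    put_multi takes a list. Both are stand-ins (assumptions), not NDB itself.
    """
    offset = 0
    while True:
        batch = fetch(batch_size, offset)
        if not batch:
            # An empty batch means every entity has been processed.
            return offset
        for entity in batch:
            entity.ignore = False
        put_multi(batch)  # one batched write instead of len(batch) put() calls
        offset += len(batch)  # move past the records just written
```

The key difference from the loop in the question is the advancing offset and the empty-batch exit; without them the same first batch is fetched forever.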
Answer 3:
You can try using hasattr to check to see if a record has the ignore property.
If you just want to assign False for all records in your AppList entity, you just need to do an update to your schema (reload the models.py file) and then you should be able to set the property to False.
More information on a schema update can be found here.
EDIT: To answer your comment:
if hasattr(entity, 'ignore'):
    # your code goes here
    pass
Answer 4:
The simplest way, without deploying new code, is to use the remote API: perform a query fetching all entities, set the property value to False, and then put() them. 4000 records is not a lot.
In fact, you don't even need to set the value explicitly: when an entity is retrieved, the property takes its default value if it currently has none.
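To illustrate the point about defaults, here is a toy model written in plain Python. It is not NDB: BooleanProp and FakeAppList are hypothetical stand-ins that only mimic the behaviour described above, under the assumption that a declared property falls back to its default on read and that the default is written out when the entity is saved again.

```python
class BooleanProp(object):
    """Tiny stand-in for a model property with a default (not NDB itself)."""

    def __init__(self, name, default=None):
        self.name = name
        self.default = default

    def __get__(self, entity, owner=None):
        if entity is None:
            return self
        # A value missing from storage falls back to the declared default.
        return entity._values.get(self.name, self.default)

    def __set__(self, entity, value):
        entity._values[self.name] = value


class FakeAppList(object):
    ignore = BooleanProp('ignore', default=False)

    def __init__(self, **stored):
        self._values = dict(stored)  # what was read from storage

    def to_stored(self):
        # Saving writes the declared property, default included.
        stored = dict(self._values)
        stored['ignore'] = self.ignore
        return stored


old = FakeAppList(counter=7)  # a record saved before `ignore` existed
print(old.ignore)             # the default is returned on read
print(old.to_stored())        # after a re-save the value is actually stored
```

In this toy model the old record reads as False immediately, but the value only reaches storage on the next save, which would explain why a filter on ignore == False misses old records until they are put() again.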
Source: https://stackoverflow.com/questions/19845425/how-to-assign-default-value-to-all-ndb-datastore-entries