Bulk delete Datastore entities older than 2 days

Submitted by 时光毁灭记忆、已成空白 on 2019-12-11 17:27:10

Question


I have an entity in Datastore with the following fields:

created_date = ndb.DateTimeProperty(auto_now_add=True)

epoch = ndb.IntegerProperty()

sent_requests = ndb.JsonProperty()

I would like to bulk delete all entities older than 2 days using a daily cron job. I am aware of ndb.delete_multi(list_of_keys), but how do I get the list of keys for entities older than 2 days? Is scanning the entire datastore, with 100+ million entities, and collecting the keys where epoch < int(time.time()) - 2*86400 the best option available?


Answer 1:


Yes. Because you only want to delete some of the entities, you need to run keys_only queries to obtain the keys to pass to ndb.delete_multi() (or its async version, ndb.delete_multi_async()). Don't worry about the total number of entities: all queries are index-based, so the response time depends on how many results the query returns, not on how many entities exist in the datastore.
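As a rough illustration of the keys_only approach, here is a minimal sketch that deletes one batch of expired entities using the epoch field from the question. The function name delete_one_batch and its parameters are assumptions made for this example, not part of ndb. The model class and ndb.delete_multi are passed in as arguments so the logic can be exercised without the App Engine SDK.

```python
import time


def delete_one_batch(model_cls, delete_multi, now=None,
                     max_age_seconds=2 * 86400, batch_size=500):
    """Delete up to batch_size entities whose epoch is older than the cutoff.

    model_cls    -- an ndb.Model subclass with an `epoch` IntegerProperty
                    (assumed to match the model described in the question)
    delete_multi -- a callable that deletes a list of keys, e.g. ndb.delete_multi
    now          -- current Unix time; defaults to int(time.time())
    Returns the number of keys passed to delete_multi.
    """
    if now is None:
        now = int(time.time())
    cutoff = now - max_age_seconds

    # keys_only=True makes the query return keys rather than full entities.
    query = model_cls.query(model_cls.epoch < cutoff)
    keys = query.fetch(batch_size, keys_only=True)

    if keys:
        delete_multi(keys)
    return len(keys)
```

With the App Engine SDK available, a call might look like delete_one_batch(MyModel, ndb.delete_multi), where MyModel is a placeholder for the model class, which the question does not name.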

However, the index may take some time to be updated after the deletions, so page through the results with query cursors rather than re-running the same query, which could return keys that have already been deleted.

Also, if you expect to delete a lot of entities, spread the load across multiple requests (for example, using the task queue or the deferred library) so that you don't exceed the request deadline. See, for example, the question "How to delete all the entries from google datastore?"
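The cursor and load-spreading advice can be combined roughly as follows. This is only a sketch under stated assumptions: delete_pages, max_pages and the return convention are invented for this example. The query is expected to be an ndb query such as the epoch filter from the question, and delete_multi would typically be ndb.delete_multi.

```python
def delete_pages(query, delete_multi, start_cursor=None,
                 page_size=500, max_pages=10):
    """Delete at most max_pages pages of keys, resuming from start_cursor.

    query        -- an object with ndb's fetch_page(page_size, **options),
                    which returns a (results, cursor, more) tuple
    delete_multi -- a callable that deletes a list of keys, e.g. ndb.delete_multi
    Returns (deleted_count, next_cursor). next_cursor is None once the query
    reports no more results; otherwise pass it to the next invocation.
    """
    deleted = 0
    cursor = start_cursor
    for _ in range(max_pages):
        # Continue from the cursor instead of re-running the query from the
        # start, so keys deleted on earlier pages are not fetched again.
        keys, cursor, more = query.fetch_page(
            page_size, keys_only=True, start_cursor=cursor)
        if keys:
            delete_multi(keys)
            deleted += len(keys)
        if not more:
            return deleted, None
    # Page budget for this request is used up; the caller should continue
    # from `cursor` in a follow-up request or task.
    return deleted, cursor
```

When next_cursor is not None, the cron handler or task could enqueue a follow-up, for instance by passing next_cursor.urlsafe() to deferred.defer() and rebuilding the cursor in the next task with ndb.Cursor(urlsafe=...). The page size and page budget here are arbitrary and would need tuning against the request deadline.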



Source: https://stackoverflow.com/questions/48360919/bulk-delete-datastore-entity-older-than-2-days
