How to mitigate long startup times in Firebase workers when the dataset gets large

最后都变了 - submitted on 2019-12-06 21:19:27

Adding on to Frank's answer, here are a couple other possibilities.

Use a queue strategy

Since the workers really expect to process one-time events, give them one-time events that they can pull from a queue and delete once they finish processing. This resolves the multiple-worker scenario elegantly and ensures nothing is ever missed because a server was offline.
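A minimal sketch of the queue pattern, assuming the namespaced Firebase JavaScript SDK (version 3 or later, where snap.key and snap.ref are properties rather than methods). The names attachQueueWorker, queueRef and processTask are hypothetical and not from the original answer. The sketch leaves out any step for claiming a task, which a setup with several workers would probably need so that two workers do not pick up the same item.

```javascript
// Hypothetical helper: process each queued task once, then delete it.
// queueRef is assumed to be a Firebase database reference, e.g.
// firebase.database().ref('queue') (the path name is an assumption).
function attachQueueWorker(queueRef, processTask) {
  // child_added fires for every existing child and for each new one,
  // so tasks written while the worker was offline are still delivered.
  return queueRef.on('child_added', function (snap) {
    processTask(snap.val(), function done(err) {
      if (err) {
        // Leave the task in place so it can be retried later.
        console.error('Task failed:', snap.key, err);
        return;
      }
      // Remove the task only after it has been processed successfully.
      snap.ref.remove();
    });
  });
}
```

Because the queue only ever holds unprocessed items, a restarting worker downloads just the pending tasks instead of the whole dataset.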

Utilize a timestamp to reduce backlog

A simple strategy for avoiding backlog during reboot/startup of the workers is to add a timestamp to all of the events and then do something like the following:

var startTime = Date.now() - 3600 * 1000; // an hour ago (Date.now() is in milliseconds)
pathRef.orderByChild('timestamp').startAt(startTime);
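A slightly fuller sketch of the same idea, which also attaches a listener to the query. The helper name listenForRecentEvents and its parameters are hypothetical. It assumes every event carries a numeric timestamp child in milliseconds since the epoch; for this query to stay efficient, the security rules would normally also declare an ".indexOn" entry for timestamp.

```javascript
// Hypothetical helper: listen only to events from the last windowMs milliseconds.
// pathRef is assumed to be a Firebase database reference whose children
// each carry a numeric `timestamp` field in milliseconds since the epoch.
function listenForRecentEvents(pathRef, windowMs, handler) {
  var startTime = Date.now() - windowMs; // Date.now() is in milliseconds
  var query = pathRef.orderByChild('timestamp').startAt(startTime);
  // Nothing is downloaded until a listener is attached to the query.
  query.on('child_added', function (snap) {
    handler(snap.key, snap.val());
  });
  return query; // keep it so the caller can call query.off() later
}

// Usage (one-hour window):
// listenForRecentEvents(pathRef, 60 * 60 * 1000, function (key, event) { /* ... */ });
```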

Keep track of the last id processed

This only works well with push ids, since formats that do not sort naturally by key will likely become out of order at some point in the future.

When processing records, have your worker keep track of the last record it processed by writing that record's key into Firebase. You can then use orderByKey().startAt( lastKeyProcessed ) to avoid the backlog. Annoyingly, startAt is inclusive, so the first key returned has to be discarded. However, this is an efficient query, does not cost data storage for an index, and is quick to implement.
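This approach might look roughly like the following, assuming the namespaced Firebase JavaScript SDK (version 3 or later) and a synchronous handler. The names resumeFromLastKey, pathRef and cursorRef are hypothetical; cursorRef stands for whatever location the worker uses to store its last processed push id.

```javascript
// Hypothetical helper: resume processing after the last key this worker handled.
// pathRef and cursorRef are assumed to be Firebase database references;
// cursorRef stores the last processed push id (or null on the first run).
function resumeFromLastKey(pathRef, cursorRef, handler) {
  cursorRef.once('value', function (cursorSnap) {
    var lastKey = cursorSnap.val();
    var query = pathRef.orderByKey();
    if (lastKey !== null) {
      query = query.startAt(lastKey); // inclusive, so lastKey itself comes back
    }
    query.on('child_added', function (snap) {
      if (snap.key === lastKey) {
        return; // already processed before the restart
      }
      handler(snap.key, snap.val());
      cursorRef.set(snap.key); // record progress for the next restart
    });
  });
}
```

Comparing each key against the stored one, rather than blindly dropping the first result, also covers the case where the last processed record has since been deleted.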

If you only need to process new comments once, you can put them in a separate list, e.g. newComments vs. comments (the ones that have been processed). Then, when you're done processing, move them from newComments to comments.
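One possible way to do the move is a single multi-location update, sketched below. The helper name markCommentProcessed and the rootRef parameter are hypothetical; rootRef is assumed to be a reference to the common parent of newComments and comments.

```javascript
// Hypothetical helper: move one processed comment from newComments to comments.
// rootRef is assumed to be a reference to the common parent of both lists.
function markCommentProcessed(rootRef, key, comment) {
  var updates = {};
  updates['comments/' + key] = comment;  // write to the processed list
  updates['newComments/' + key] = null;  // a null value deletes that path
  // A single multi-location update applies both changes together.
  return rootRef.update(updates);
}
```

Writing both paths in one update avoids a window in which the comment exists in both lists or in neither.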

Alternatively you can keep all comments in a single list like you have today and add a field (e.g. "isNew") to it that you set to true initially. Then you can filter with orderByChild('isNew').equalTo(true) and update({ isNew: false }) once you're done with processing.
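As a rough sketch of the flag-based variant, assuming the namespaced Firebase JavaScript SDK (version 3 or later) and a synchronous handler; processNewComments and commentsRef are hypothetical names:

```javascript
// Hypothetical helper: process only comments whose isNew flag is still true.
// commentsRef is assumed to be a Firebase database reference to the list.
function processNewComments(commentsRef, handler) {
  var query = commentsRef.orderByChild('isNew').equalTo(true);
  query.on('child_added', function (snap) {
    handler(snap.key, snap.val());
    // Clear the flag so this comment no longer matches the query.
    snap.ref.update({ isNew: false });
  });
  return query; // keep it so the caller can call query.off() later
}
```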
