How do I detect unexpected worker role failures and reprocess data in those cases?

梦谈多话 2021-01-21 23:01

I want to create a web service hosted in Windows Azure. Clients will upload files for processing, the cloud will process those files and produce result files, and the clients will download the results. How do I detect that a worker role died unexpectedly partway through processing a file, so that the file can be reprocessed?

4 Answers
  •  庸人自扰
    2021-01-21 23:36

    The main issue you have is that queues today cannot set a visibility timeout larger than 2 hrs. So, you need another mechanism to indicate that active work is in progress. I would suggest a blob lease. For every file you process, you lease either the blob itself or a 0-byte marker blob. Your workers scan the available blobs and attempt to lease them. If a worker gets the lease, the file is not being processed, so it goes ahead and processes it. If acquiring the lease fails, another worker must be actively working on it.
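    The answer itself names no code, so here is a minimal sketch of that scan-and-lease loop using the azure-storage-blob Python SDK (v12), purely to illustrate the idea; the connection string, the "incoming" container name, and process_file() are assumptions, not anything from the original post.

```python
# Sketch: scan a container, try to lease each blob, and only process blobs
# whose lease we win. Names below (CONNECTION_STRING, "incoming",
# process_file) are illustrative placeholders.
from azure.core.exceptions import HttpResponseError
from azure.storage.blob import BlobServiceClient

CONNECTION_STRING = "<storage connection string>"  # assumed to be configured elsewhere
INCOMING = "incoming"                              # container holding files awaiting work

service = BlobServiceClient.from_connection_string(CONNECTION_STRING)
container = service.get_container_client(INCOMING)


def process_file(blob) -> None:
    """Placeholder for the real per-file work (download, transform, upload results)."""
    ...


def try_claim_and_process(blob_name: str) -> None:
    blob = container.get_blob_client(blob_name)
    try:
        # 60 s is the longest finite lease; it must be renewed while we work.
        lease = blob.acquire_lease(lease_duration=60)
    except HttpResponseError:
        # Lease already held: another worker is actively processing this file.
        return
    try:
        process_file(blob)
    finally:
        lease.release()


for props in container.list_blobs():
    try_claim_and_process(props.name)
```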

    Once the worker has completed processing the file, it simply copies the file into another container in blob storage (or deletes it if you wish) so that it is not scanned again.
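    Continuing the sketch above, the "move it out of the scanned container when done" step might look like this; the "processed" destination container is an assumed name, and `service`/`INCOMING` come from the previous snippet.

```python
# Sketch: after successful processing, copy the blob to another container
# (or just delete it) so the scan loop never picks it up again.
def mark_done(blob_name: str) -> None:
    source = service.get_blob_client(container=INCOMING, blob=blob_name)
    target = service.get_blob_client(container="processed", blob=blob_name)
    # Server-side copy within the same storage account; a cross-account copy
    # would need a SAS on the source URL, and for large blobs you would poll
    # the copy status before deleting the source.
    target.start_copy_from_url(source.url)
    # The lease must have been released first, or passed here via lease=...
    source.delete_blob()
```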

    Leases are really your only answer here until queue messages can be renewed.

    Edit: I should clarify that the reason leases work here is that a lease must be actively maintained (renewed every 30 seconds or so), so there is only a small window before you know whether a worker has died or is still working on the file.
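    A sketch of that keep-alive, under the same assumptions as the earlier snippets: renew the lease on a background thread every ~30 s while the real work runs. If the worker process dies, renewals stop, the 60 s lease lapses, and another worker's acquire_lease() call succeeds so the file gets reprocessed.

```python
# Sketch: keep the blob lease alive while processing, then release it.
import threading


def process_with_keepalive(blob, lease, work) -> None:
    stop = threading.Event()

    def keep_alive() -> None:
        while not stop.wait(30):  # wake every 30 s until asked to stop
            lease.renew()         # resets the 60 s lease clock

    renewer = threading.Thread(target=keep_alive, daemon=True)
    renewer.start()
    try:
        work(blob)                # the actual file processing
    finally:
        stop.set()
        renewer.join()
        lease.release()           # let the blob be claimed or cleaned up
```

    In the earlier try_claim_and_process() sketch you would call process_with_keepalive(blob, lease, process_file) instead of calling process_file() directly.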
