Capture Heroku SIGTERM in Celery workers to shut down the worker gracefully

Posted by 时光怂恿深爱的人放手 on 2019-12-03 16:27:30

Question


I've done a lot of research on this, and I'm surprised I haven't found a good answer anywhere.

I'm running a large application on Heroku, and some of my Celery tasks run for a very long time and save a result only at the end. Every time I redeploy, Heroku sends SIGTERM (and eventually SIGKILL) and kills my running worker. I'm trying to find a way for the worker to shut down gracefully and re-queue the task for later processing, so that the result is eventually saved instead of the task being lost.

I cannot find a working way to have the worker listen for SIGTERM properly. The closest I've gotten, which works when running python manage.py celeryd directly but NOT when emulating Heroku with foreman, is the following:

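import logging
import time

from celery import exceptions

logger = logging.getLogger(__name__)

# `app` is the project's Celery application instance, defined elsewhere.
@app.task(bind=True, max_retries=1)
def slow(self, x):
    try:
        # Simulate a long-running job. Note that the loop variable shadows
        # the argument x, so the task returns the last loop value.
        for x in range(100):
            print('x: ' + str(x))
            time.sleep(10)
    except exceptions.MaxRetriesExceededError:
        logger.error('whoa')
    except (exceptions.WorkerShutdown, exceptions.WorkerTerminate) as exc:
        # Attempt to re-queue the task when the worker is shutting down.
        logger.error('retrying, %s', exc)
        raise self.retry(exc=exc, countdown=10)
    except (KeyboardInterrupt, SystemExit) as exc:
        print('retrying')
        raise self.retry(exc=exc, countdown=10)
    else:
        return x
    finally:
        logger.info('task ended!')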

When I start this Celery task under foreman and press Ctrl+C, the following happens:

^CSIGINT received
22:20:59 system   | sending SIGTERM to all processes
22:20:59 web.1    | exited with code 0
22:21:04 system   | sending SIGKILL to all processes
Killed: 9

So it's clear that neither the Celery exceptions nor the KeyboardInterrupt and SystemExit exceptions I've seen suggested in other posts catch SIGTERM and shut down the worker properly.

What is the right way to do this?


Answer 1:


Unfortunately, Celery was not designed to interrupt a running task cleanly. Celery workers do respond to SIGTERM, but if a task is incomplete, the worker processes wait for it to finish and only then exit. If the workers don't shut down within a reasonable time you can send SIGKILL, but then information is lost, i.e. you may not know which jobs remained incomplete.
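The SIGTERM-then-SIGKILL sequence described above (and visible in the foreman log in the question) can be sketched from the supervisor's side in plain Python. This is only an illustration of the pattern, not how Heroku or foreman are implemented; the function name stop_gracefully and the 5-second default are assumptions made for the example, and the signal semantics described in the comments apply to POSIX systems.

```python
import subprocess
import sys


def stop_gracefully(proc, grace_seconds=5.0):
    """Ask a child process to exit, then force it if it overstays the grace period.

    Returns "terminated" if the child exited after SIGTERM,
    or "killed" if SIGKILL was needed.
    """
    proc.terminate()  # sends SIGTERM on POSIX; the child may handle or ignore it
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX; the child gets no chance to clean up
        proc.wait()
        return "killed"


if __name__ == "__main__":
    # Hypothetical stand-in for a worker that is busy with a long task.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_gracefully(child, grace_seconds=5.0))
```

A worker that is still waiting for a long task when the grace period ends falls into the "killed" branch, which is the situation the question describes.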




Answer 2:


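You can use acks_late (the per-task option) or task_acks_late (the global setting).

With late acknowledgment, a task's message is acknowledged after the task has finished executing rather than just before it starts. If the worker is shut down before the task completes, the message stays unacknowledged and the broker delivers it again, so the task is re-run. Because a task may then run more than once, it should be idempotent.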
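As a minimal sketch, the global setting could live in a settings module like the one below. The file name celeryconfig.py and loading it with app.config_from_object("celeryconfig") are assumptions about the project layout; task_reject_on_worker_lost is an additional Celery setting not mentioned in this answer, included here on the assumption that you also want the message re-queued when the process running the task is killed outright.

```python
# celeryconfig.py: a hypothetical settings module, loaded with
# app.config_from_object("celeryconfig")

# Acknowledge each message after the task returns instead of just before it starts.
task_acks_late = True

# Assumption: re-queue the message if the process executing the task exits
# abruptly. Without this, Celery acknowledges the message in that case even
# when late acknowledgment is enabled. Use with care: a task that always
# crashes its process would be redelivered again and again.
task_reject_on_worker_lost = True
```

For a single task, the equivalent is passing acks_late=True to the task decorator, for example @app.task(bind=True, acks_late=True).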




Answer 3:


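Starting with version 4, Celery ships with a feature, aimed at Heroku, that supports this out of the box. Setting the REMAP_SIGTERM environment variable to SIGQUIT makes the worker handle SIGTERM the way it handles SIGQUIT, that is, as a cold shutdown that terminates as soon as possible instead of waiting for running tasks: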

$ REMAP_SIGTERM=SIGQUIT celery -A proj worker -l info

Source: https://devcenter.heroku.com/articles/celery-heroku#using-remap_sigterm
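To make the idea of remapping concrete, here is a toy model in plain Python. It is not Celery's source code: the Worker class and its method names are hypothetical, and only the REMAP_SIGTERM variable and the warm (SIGTERM) versus cold (SIGQUIT) distinction come from Celery. It assumes a POSIX system, since SIGQUIT is not available on Windows.

```python
import os
import signal


class Worker:
    """Toy worker used to illustrate signal remapping; not Celery's implementation."""

    def __init__(self):
        self.stop_requested = False

    def warm_shutdown(self, signum, frame):
        # Warm: only set a flag, so the current task can finish before exiting.
        self.stop_requested = True

    def cold_shutdown(self, signum, frame):
        # Cold: abandon the current task and exit as soon as possible.
        raise SystemExit(1)

    def install_signal_handlers(self, environ=None):
        """Bind SIGTERM to the cold handler when REMAP_SIGTERM=SIGQUIT is set."""
        environ = os.environ if environ is None else environ
        remap = environ.get("REMAP_SIGTERM") == "SIGQUIT"
        term_handler = self.cold_shutdown if remap else self.warm_shutdown
        signal.signal(signal.SIGTERM, term_handler)
        signal.signal(signal.SIGQUIT, self.cold_shutdown)
        return term_handler
```

With the remap in place, the worker stops inside the platform's grace period instead of waiting for a long task. Combined with late acknowledgment, the interrupted task's message should remain unacknowledged and be delivered again later.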



Source: https://stackoverflow.com/questions/29872998/capture-heroku-sigterm-in-celery-workers-to-shutdown-worker-gracefully
