celery missed heartbeat (on_node_lost)

Posted by 孤街浪徒 on 2019-12-20 10:16:23

Question


I just upgraded to Celery 3.1 and now I see this in my logs:

on_node_lost - INFO - missed heartbeat from celery@queue_name

It shows up for every queue/worker in my cluster.

According to the docs BROKER_HEARTBEAT is off by default and I haven't configured it.

Should I explicitly set BROKER_HEARTBEAT=0 or is there something else that I should be checking?


Answer 1:


I saw the same thing, and noticed a few things in the log files.

1) There were messages about time drift at the start of the log and occasional missed heartbeats.

2) At the end of the log file, the drift messages went away and only the missed heartbeat messages were present.

3) There were no changes to the system when the drift messages went away... They just stopped showing up.

I figured that the drift itself was likely the problem.

After syncing the time on all the servers involved, these messages went away. On Ubuntu, run ntpdate from cron, or run ntpd.
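
To illustrate why an unsynced clock can look like a lost node, here is a minimal sketch of the comparison a receiver could make between the timestamp on a heartbeat and its own clock. The function names and the 16-second threshold are assumptions made up for this example, not Celery's actual code.

```python
# Hypothetical threshold, chosen only for illustration.
DRIFT_WARN_SECONDS = 16.0


def clock_drift(remote_timestamp, local_received):
    """Absolute gap in seconds between the sender's timestamp on a
    heartbeat and the local clock reading when it arrived."""
    return abs(local_received - remote_timestamp)


def is_drifting(remote_timestamp, local_received, threshold=DRIFT_WARN_SECONDS):
    """True when the gap is larger than the (assumed) warning threshold."""
    return clock_drift(remote_timestamp, local_received) > threshold
```

With these definitions, a heartbeat stamped 30 seconds off the local clock counts as drifting, while one stamped half a second off does not. Once the clocks are synced the gap shrinks back to ordinary network delay.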




Answer 2:


Celery 3.1 added the new mingle and gossip procedures. I was also getting a ton of missed heartbeats, and passing --without-gossip to my workers cleared it up.

http://docs.celeryproject.org/en/latest/whatsnew-3.1.html#mingle-worker-synchronization

http://docs.celeryproject.org/en/latest/whatsnew-3.1.html#gossip-worker-worker-communication
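
As a rough sketch of what passing the flag looks like when workers are launched from a script, the helper below builds the worker command line. The helper name worker_argv and the app name "proj" are hypothetical; --without-gossip and --without-mingle are the worker options that go with the features described in the links above.

```python
def worker_argv(app_name, gossip=False, mingle=True):
    """Build a `celery worker` command line as a list of arguments.

    Gossip is turned off by default here, matching the fix described
    in this answer; mingle is left on unless asked otherwise.
    """
    argv = ["celery", "-A", app_name, "worker", "--loglevel=INFO"]
    if not gossip:
        argv.append("--without-gossip")
    if not mingle:
        argv.append("--without-mingle")
    return argv


# Example: pass the list to subprocess.call() to start the worker.
# import subprocess
# subprocess.call(worker_argv("proj"))
```

Typed by hand, the equivalent command would be: celery -A proj worker --loglevel=INFO --without-gossip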




Answer 3:


I'm having a similar issue, and I have found the reason in my case.

I have two servers running workers.

When I ping one server from the other, I found that whenever the ping time is larger than 2 seconds, the log shows "missed heartbeat from celery@". The default heartbeat interval is 2 seconds.

The cause is my poor network. http://docs.celeryproject.org/en/latest/internals/reference/celery.worker.heartbeat.html
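
If ICMP ping is not available, a rough way to run the same check is to time a TCP connect to the other machine and compare it with the 2-second interval. This is only a sketch: the function names are made up for this example, and the host and port in the usage comment are placeholders (5672 assumes a RabbitMQ broker on its default port).

```python
import socket
import time

HEARTBEAT_INTERVAL = 2.0  # seconds, the figure quoted in this answer


def tcp_round_trip(host, port, timeout=5.0):
    """Time one TCP connect to host:port and return it in seconds."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection is closed on leaving the block
    return time.monotonic() - start


def slower_than_heartbeat(rtt, interval=HEARTBEAT_INTERVAL):
    """True when a measured round trip exceeds the heartbeat interval."""
    return rtt > interval


# Example (placeholder host, assumed RabbitMQ port):
# rtt = tcp_round_trip("other-worker.example", 5672)
# print(rtt, slower_than_heartbeat(rtt))
```

A single measurement is noisy, so it is worth repeating the call a few times before blaming the network.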



Source: https://stackoverflow.com/questions/21132240/celery-missed-heartbeat-on-node-lost
