Question
How can I solve the ConnectionError: Too many heartbeats missed error from Celery?
Example Error
[2013-02-11 15:15:38,513: ERROR/MainProcess] Error in timer: ConnectionError('Too many heartbeats missed', None, None, None, '')
Traceback (most recent call last):
  File "/app/.heroku/python/lib/python2.7/site-packages/celery/utils/timer2.py", line 97, in apply_entry
    entry()
  File "/app/.heroku/python/lib/python2.7/site-packages/celery/utils/timer2.py", line 51, in __call__
    return self.fun(*self.args, **self.kwargs)
  File "/app/.heroku/python/lib/python2.7/site-packages/celery/utils/timer2.py", line 153, in _reschedules
    return fun(*args, **kwargs)
  File "/app/.heroku/python/lib/python2.7/site-packages/kombu/connection.py", line 265, in heartbeat_check
    return self.transport.heartbeat_check(self.connection, rate=rate)
  File "/app/.heroku/python/lib/python2.7/site-packages/kombu/transport/pyamqp.py", line 134, in heartbeat_check
    return connection.heartbeat_tick(rate=rate)
  File "/app/.heroku/python/lib/python2.7/site-packages/amqp/connection.py", line 837, in heartbeat_tick
    raise ConnectionError('Too many heartbeats missed')
ConnectionError: Too many heartbeats missed
App Overview
- Django app using celery for periodic background tasks
- Hosted on Heroku
- Single task scheduled to run every 15 minutes via settings / celerybeat
- Messaging handled via CloudAMQP add-on
- Processes run by:
    web: newrelic-admin run-program gunicorn --workers=2 --worker-class=gevent someapp.wsgi:application
    scheduler: newrelic-admin run-program python manage.py celery worker -B -E --maxtasksperchild=1000 --loglevel=WARNING
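For reference, a 15-minute schedule of the kind mentioned above is typically declared in the Django settings along these lines. This is only a sketch, not the app's actual configuration: the entry name and the task path are hypothetical placeholders, and it assumes the Celery 3.0-style CELERYBEAT_SCHEDULE setting.

```python
from datetime import timedelta

# Sketch of a celerybeat schedule in Django settings (Celery 3.0-style name).
# "run-periodic-task" and "someapp.tasks.periodic_task" are hypothetical
# placeholders; the real entry name and task path are not given here.
CELERYBEAT_SCHEDULE = {
    "run-periodic-task": {
        "task": "someapp.tasks.periodic_task",
        # Run the task once every 15 minutes.
        "schedule": timedelta(minutes=15),
    },
}
```

The scheduler process above runs the worker with -B, so the beat scheduler that reads this setting is embedded in the same process as the worker.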
Package Versions
Only the ones I think are relevant:
Django==1.4.3
amqp==1.0.8
billiard==2.7.3.20
celery==3.0.14
gevent==0.13.8
greenlet==0.4.0
kombu==2.5.6
raven==3.1.10
What I've Tried So Far
- Correlating the error with activity (it doesn't seem to correlate with users visiting the app, background tasks being called, or the app idling)
- Googling / searching SO until my fingers were numb
- Upgrading packages to latest versions
- Various levels of logging
- Exception capturing with Sentry (the error doesn't appear in Sentry)
- Cannot reproduce error locally under development environment, only in production on Heroku
Possible Relevant Info
- I'm not sure exactly when this error first appeared (~ one month ago?)
- May be related in some way to the following changes (I don't recall the error before this, though I'm not 100% sure):
    - celery==3.0.13 to celery==3.0.14
    - amqplib -> amqp
    - kombu==2.4.8 to kombu==2.5.4
- Error only appears in logs (doesn't get picked up by New Relic or getsentry.com)