python-rq worker closes automatically

Submitted by 六眼飞鱼酱 on 2020-01-06 20:19:31

Question


I am using python-rq to pass domains through a queue and scrape them with Beautiful Soup, so I am running multiple workers to get the job done. I started 22 workers, and all 22 are registered in the rq dashboard. But after some time a worker stops by itself and is no longer displayed in the dashboard, while webmin still shows all the workers as running. The crawling speed has also decreased, i.e. the workers are not actually doing any work. I tried running the workers under supervisor and with nohup; in both cases the workers stop by themselves.
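
For context, the setup is roughly the sketch below; the queue name, the Redis connection details, and the scrape_domain function are simplified placeholders for illustration, not my actual code.

from redis import Redis
from rq import Queue

def scrape_domain(domain):
    # Placeholder for the real Beautiful Soup scraping logic.
    # In practice this must live in a module the workers can import.
    ...

redis_conn = Redis()
queue = Queue(connection=redis_conn)  # default queue

# Each domain becomes one rq job; the 22 workers pick these up.
for domain in ["example.com", "example.org"]:
    queue.enqueue(scrape_domain, domain)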

What is the reason for this? Why do the workers stop by themselves? And how many workers can we start on a single server?

Along with that, whenever a worker is unregistered from the rq dashboard, the failed count increases, and I don't understand why.

Please help me with this. Thank you.


Answer 1:


Okay, I figured out the problem. It was caused by the worker timeout.

import sys  # needed for sys.exc_info()

try:
    ...  # --my code goes here--
except Exception as ex:  # catches everything, including rq.timeouts.JobTimeoutException
    self.error += 1
    # Log the exception and the current url to error.txt, then continue.
    with open("error.txt", "a") as myfile:
        myfile.write('\n%s' % sys.exc_info()[0] + "{}".format(self.url))

So according to my code, the next domain is dequeued only after 200 URLs have been fetched from the current domain. But for some domains there were not enough URLs for that condition to ever be met (only 1 or 2 URLs).

Since the code catches every exception and appends it to the error.txt file, even rq's timeout exception, rq.timeouts.JobTimeoutException, was caught and written to the file. The worker was therefore left waiting indefinitely on such domains, which eventually led to its termination.
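
One way to fix this is to stop swallowing the timeout. A minimal sketch of the idea (the fetch_urls helper and the 200-URL loop are simplified stand-ins, not my exact code): re-raise JobTimeoutException so rq can terminate the job cleanly instead of the worker hanging.

import sys
from rq.timeouts import JobTimeoutException

def crawl_domain(url):
    collected = 0
    try:
        while collected < 200:                # dequeue the next domain only after 200 urls
            collected += fetch_urls(url)      # hypothetical scraping helper
    except JobTimeoutException:
        # Let rq handle its own timeout instead of logging it and waiting forever.
        raise
    except Exception:
        with open("error.txt", "a") as myfile:
            myfile.write('\n%s %s' % (sys.exc_info()[0], url))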



Source: https://stackoverflow.com/questions/37982703/python-rq-worker-closes-automatically
