Scrapy + Splash (Docker) Issue

Posted by 你说的曾经没有我的故事 on 2019-12-05 08:00:07

Question


I have scrapy and scrapy-splash set up on an AWS Ubuntu server. It works fine for a while, but after a few hours I'll start getting error messages like this:

Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.5/site-packages/twisted/internet/defer.py", line 1384, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/ubuntu/.local/lib/python3.5/site-packages/twisted/python/failure.py", line 393, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/ubuntu/.local/lib/python3.5/site-packages/scrapy/core/downloader/middleware.py", line 43, in process_request
    defer.returnValue((yield download_func(request=request, spider=spider)))
twisted.internet.error.ConnectionRefusedError: Connection was refused by other side: 111: Connection refused.

I'll find that the splash process in docker has either terminated, or is unresponsive.

I've been running the splash process with:

sudo docker run -p 8050:8050 scrapinghub/splash

as per the scrapy-splash instructions.

I tried starting the process in a tmux shell to make sure the ssh connection is not interfering with the splash process, but no luck.

Thoughts?


Answer 1:


You should run the container with the --restart and -d options. See the documentation on how to run Splash in production.
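As a sketch, the original command could be adapted like this. The -d flag detaches the container from the terminal (so a dropped SSH session can't kill it), and --restart always tells the Docker daemon to restart the container automatically if the Splash process crashes or becomes unresponsive and exits:

sudo docker run -d -p 8050:8050 --restart always scrapinghub/splash

You can then check that the container is up with `sudo docker ps`, and inspect its logs with `sudo docker logs <container-id>` if the crashes continue. Note that --restart only covers the crash case; if Splash hangs without exiting, adding a memory limit such as --memory=2G (value is an assumption, tune to your instance) is a common complement, since Splash is known to accumulate memory over long crawls.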



Source: https://stackoverflow.com/questions/45450544/scrapy-splash-docker-issue
