Using docker, scrapy splash on Heroku

广开言路 2020-12-16 04:23

I have a Scrapy spider that uses Splash, which runs in Docker on localhost:8050, to render JavaScript before scraping. I am trying to run this on Heroku but have no idea how to c

2 answers
  •  借酒劲吻你
    2020-12-16 05:15

    I ran into the same problem and finally managed to deploy the Splash Docker image on Heroku. My solution: I cloned the Splash project from GitHub and changed the Dockerfile.

    • Removed the EXPOSE instruction, because Heroku does not support it.
    • Replaced ENTRYPOINT with a CMD instruction in shell form, so that $PORT is expanded when the container starts:

    CMD python3 /app/bin/splash --proxy-profiles-path /etc/splash/proxy-profiles --js-profiles-path /etc/splash/js-profiles --filters-path /etc/splash/filters --lua-package-path /etc/splash/lua_modules/?.lua --port $PORT

    Notice that I added the option --port $PORT, so that Splash listens on the port assigned by Heroku instead of its default (8050).
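To make the port handling concrete, here is a minimal shell sketch of how the port could be resolved with a fallback for local runs. The helper names `splash_port` and `splash_command` are hypothetical, the fallback to 8050 is an assumption (it is not part of the modified Dockerfile), and the other Splash flags from the CMD line are left out for brevity.

```shell
# Hypothetical helper: pick the port Splash should listen on.
# Heroku sets PORT at runtime; 8050 (Splash's default) is assumed
# here as a fallback for local runs where PORT is not set.
splash_port() {
  echo "${PORT:-8050}"
}

# Hypothetical helper: print (not execute) the resulting command line.
# Only the --port flag is shown; the remaining flags are omitted.
splash_command() {
  echo "python3 /app/bin/splash --port $(splash_port)"
}
```

With PORT unset, `splash_command` prints a command using port 8050; with `PORT=5000` exported it prints one using port 5000.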

    A fork of the project with this change is available here. You just need to build the Docker image and push it to Heroku's registry, as you did before. You can test it locally first, but you must pass the PORT environment variable when running the container:

    sudo docker run -p 80:80 -e PORT=80 mynewsplashimage
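The build-and-push step mentioned above could look roughly like the following sketch, which uses the Heroku CLI's container commands. The app name `my-splash-app`, the `run` dry-run wrapper, and the `deploy_splash` function are assumptions added for illustration; the script is meant to be run from the directory that contains the modified Dockerfile.

```shell
# Assumption: replace my-splash-app with the name of your own Heroku app.
APP_NAME="${APP_NAME:-my-splash-app}"

# Hypothetical wrapper: with DRY_RUN=1 the command is only printed,
# otherwise it is executed as given.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Log in to Heroku's container registry, build and push the image
# as the "web" process type, then release it to the app.
deploy_splash() {
  run heroku container:login
  run heroku container:push web --app "$APP_NAME"
  run heroku container:release web --app "$APP_NAME"
}
```

Calling `DRY_RUN=1 deploy_splash` prints the three commands without touching Heroku, which is a cheap way to check the app name before a real deploy.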
