Using docker, scrapy splash on Heroku

广开言路 2020-12-16 04:23

I have a Scrapy spider that uses Splash, which runs in Docker on localhost:8050, to render JavaScript before scraping. I am trying to run this on Heroku but have no idea how to c

2 answers
  •  [愿得一人]
    2020-12-16 05:27

    From what I gather you're expecting:

    • Splash instance running on Heroku via Docker container
    • Your web application (Scrapy spider) running in a Heroku dyno

    Splash instance

    • As described in Heroku's Container Registry documentation, under "Pushing existing image(s)":
      • Ensure the docker CLI and heroku CLI are installed
      • heroku container:login
      • docker tag scrapinghub/splash registry.heroku.com/<app-name>/web
      • docker push registry.heroku.com/<app-name>/web
      • To test the application: heroku open -a <app-name>. This should allow you to see the Splash UI at port 8050 on the Heroku host for this app name.
        • Heroku does not respect the EXPOSE instruction in a Dockerfile, so you may need to make sure Splash listens on the port Heroku supplies in $PORT (https://devcenter.heroku.com/articles/container-registry-and-runtime#dockerfile-commands-and-runtime)
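
The push steps above can be sketched as a small Python helper that builds the command list and, optionally, runs it. This is only an illustration under stated assumptions: the function names and the `my-splash-app` app name are hypothetical, the `docker pull` step assumes the image is not yet present locally, and the final `heroku container:release` step is not part of the answer (it is added here on the assumption that a pushed image still has to be released before the app serves it).

```python
import subprocess
from typing import List


def splash_deploy_commands(app_name: str, image: str = "scrapinghub/splash") -> List[List[str]]:
    """Build the CLI commands that push the Splash image to Heroku's registry."""
    target = f"registry.heroku.com/{app_name}/web"
    return [
        ["heroku", "container:login"],
        ["docker", "pull", image],          # assumption: image not yet pulled locally
        ["docker", "tag", image, target],
        ["docker", "push", target],
        # Assumption, not in the answer: release the pushed image.
        ["heroku", "container:release", "web", "-a", app_name],
    ]


def deploy(app_name: str, dry_run: bool = True) -> None:
    """Print each command; run it only when dry_run is False."""
    for cmd in splash_deploy_commands(app_name):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    deploy("my-splash-app")  # hypothetical app name; dry run only prints
```

With `dry_run=True` nothing is executed, so the commands can be reviewed before running them against a real app.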

    Running Dyno Scrapy Web App

    • Configure your application to point to <heroku-app-host>:8050. The Scrapy spider should then be able to send requests to the Splash instance deployed above.
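
As a rough sketch of that configuration, assuming the spider uses the scrapy-splash plugin, the Splash address can be read from an environment variable in the project's settings.py so the same code works locally and on Heroku. The `SPLASH_URL` environment variable name is a hypothetical choice; the middleware entries follow the plugin's documented setup.

```python
import os

# Address of the Splash instance. The environment variable name is an
# assumption; locally it falls back to the Docker container on port 8050.
SPLASH_URL = os.environ.get("SPLASH_URL", "http://localhost:8050")

# Middleware setup as documented for the scrapy-splash plugin.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}

SPIDER_MIDDLEWARES = {
    "scrapy_splash.SplashDeduplicateArgsMiddleware": 100,
}

# Makes duplicate filtering aware of Splash request arguments.
DUPEFILTER_CLASS = "scrapy_splash.SplashAwareDupeFilter"
```

On the dyno that runs the spider, the variable could be set with heroku config:set so that it points at the Splash app's host.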
