How can I host a backend service powered by web scraping using selenium web driver?

时光怂恿深爱的人放手 提交于 2021-01-28 18:15:54

问题


So I am developing a project to scrape a website and deliver data to users, however I am using selenium/selenium web driver with python/flask. I was originally going to use beautifulsoup, but the website I am scraping requires some interactions on the page.

I have everything working with the scraper, I am just trying to figure out a way to make this work universally if I wanted to host this service on a website using a service such as heroku.

Currently Selenium is opening a chrome browser and scraping through the pages that way. Is there a smart way to do this without opening a browser and that will work seamlessly when hosted using some service?


回答1:


You can use the "--headless" argument for your driver.

The argument will do the job but without opening an actual browser.

example:

chrome_options = Options()
chrome_options.add_argument("--headless")

driver = webdriver.Chrome(r"drivers/chromedriver.exe",options=chrome_options)



回答2:


Check this tutorial out: https://youtu.be/Ven-pqwk3ec. It shows how to set selenium up on Heroku. For your backend, you may want to run the script at a specific time using celery. Or if you want it to be triggered by user interaction you can also use celery for this async task



来源:https://stackoverflow.com/questions/61571485/how-can-i-host-a-backend-service-powered-by-web-scraping-using-selenium-web-driv

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!