问题
So I am developing a project to scrape a website and deliver data to users, however I am using selenium/selenium web driver with python/flask. I was originally going to use beautifulsoup, but the website I am scraping requires some interactions on the page.
I have everything working with the scraper, I am just trying to figure out a way to make this work universally if I wanted to host this service on a website using a service such as heroku.
Currently Selenium is opening a chrome browser and scraping through the pages that way. Is there a smart way to do this without opening a browser and that will work seamlessly when hosted using some service?
回答1:
You can use the "--headless" argument for your driver.
The argument will do the job but without opening an actual browser.
example:
chrome_options = Options()
chrome_options.add_argument("--headless")
driver = webdriver.Chrome(r"drivers/chromedriver.exe",options=chrome_options)
回答2:
Check this tutorial out: https://youtu.be/Ven-pqwk3ec. It shows how to set selenium up on Heroku. For your backend, you may want to run the script at a specific time using celery. Or if you want it to be triggered by user interaction you can also use celery for this async task
来源:https://stackoverflow.com/questions/61571485/how-can-i-host-a-backend-service-powered-by-web-scraping-using-selenium-web-driv