Docker: using container with headless Selenium Chromedriver

Submitted by 喜你入骨 on 2021-02-17 20:56:39

Question


I'm trying to link peroumal1's "docker-chrome-selenium" container to another container with scraping code that uses Selenium.

His container exposes port 4444 (the Selenium default), but I'm having trouble reaching it from my scraper container. Here's my docker-compose file:

chromedriver:
  image: eperoumalnaik/docker-chrome-selenium:latest

scraper:
  build: .
  command: python manage.py scrapy crawl general_course_content
  volumes:
    - .:/code
  ports:
    - "8000:8000"
  links:
    - chromedriver

and here's my scraper Dockerfile:

FROM python:2.7

RUN mkdir /code
WORKDIR /code
ADD requirements.txt /code/

RUN pip install --upgrade pip
RUN pip install -r requirements.txt
ADD . /code/

When I try to use Selenium from my code (see below), however, I get the following error:

selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be available in the path. Please look at http://docs.seleniumhq.org/download/#thirdPartyDrivers and read up at http://code.google.com/p/selenium/wiki/ChromeDriver.

On Mac OS X, when I wasn't using Docker, I fixed this by downloading the chromedriver binary and adding it to my PATH, but I don't know what to do here.

from selenium import webdriver

driver = webdriver.Chrome()
driver.maximize_window()
driver.get('http://google.com')
driver.close()

Edit: I'm also trying to do this with Selenium's official images and, unfortunately, it's not working either (the same error message asking for the chromedriver binary appears).

Is there something that needs to be done on the Python code?

Thank you!

Update: As @peroumal1 said, the problem was that I wasn't connecting to a remote driver with Selenium. After I did, I still had connectivity problems (urllib2.URLError: <urlopen error [Errno 111] Connection refused>) until I changed the IP address the driver connects to (with boot2docker you have to connect to the virtual machine's IP rather than your computer's localhost; you can find it by running boot2docker ip) and updated the docker-compose file. This is what I ended up with:

chromedriver:
  image: selenium/standalone-chrome
  ports:
    - "4444:4444"

scraper:
  build: .
  command: python manage.py scrapy crawl general_course_content
  volumes:
    - .:/code
  ports:
    - "8000:8000"
  links:
    - chromedriver

And the Python code (boot2docker's IP address on my computer is 192.168.59.103):

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

driver = webdriver.Remote(
    command_executor='http://192.168.59.103:4444/wd/hub',
    desired_capabilities=DesiredCapabilities.CHROME)
driver.maximize_window()
driver.get('http://google.com')
driver.close()
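Hardcoding the boot2docker IP ties the script to one machine. A minimal sketch of reading the host from an environment variable instead, falling back to the IP above (the SELENIUM_HOST variable name is my own, not anything Selenium or Docker defines):

import os

# SELENIUM_HOST is a hypothetical variable name; fall back to the
# boot2docker IP used in the snippet above.
selenium_host = os.environ.get('SELENIUM_HOST', '192.168.59.103')
command_executor = 'http://%s:4444/wd/hub' % selenium_host

The resulting command_executor string can then be passed to webdriver.Remote exactly as in the snippet above.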

Answer 1:


I think the issue here might not be Docker but the code. The Selenium images provide an interface to a Selenium server through remote WebDriver, while the code shown tries to instantiate a Chrome browser directly through chromedriver, which the Selenium Python bindings allow provided that chromedriver is accessible from the environment.

Maybe it would work better using the example from the docs:

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

driver = webdriver.Remote(
    command_executor='http://127.0.0.1:4444/wd/hub',
    desired_capabilities=DesiredCapabilities.CHROME)
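When the scraper itself runs inside the linked container (rather than on the host through boot2docker), the links: entry in the compose file should make the hub reachable under its service alias, so the executor URL could be http://chromedriver:4444/wd/hub instead of an IP address. A small sketch for checking that the hub is reachable before starting a crawl (the hub_is_up helper is my own, not part of Selenium):

try:
    from urllib2 import urlopen  # Python 2, matching the python:2.7 base image
except ImportError:
    from urllib.request import urlopen  # Python 3

# 'chromedriver' is the link alias from the docker-compose file; Docker's
# links feature makes it resolvable from inside the scraper container.
HUB_URL = 'http://chromedriver:4444/wd/hub'

def hub_is_up(url, timeout=5):
    """Return True if the Selenium hub answers on its /status endpoint."""
    try:
        urlopen(url + '/status', timeout=timeout)
        return True
    except Exception:
        return False

A check like hub_is_up(HUB_URL) at startup turns the opaque "Connection refused" into an early, explicit failure.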


Source: https://stackoverflow.com/questions/29781266/docker-using-container-with-headless-selenium-chromedriver
