Python Selenium on an OpenShift server

Submitted by 白昼怎懂夜的黑 on 2019-12-11 13:14:19

Question


I need to scrape some data with Python. Because of some Java code in the target page, I could not use Python's twill and mechanize modules, so I need to run the selenium module on my OpenShift server. How can I set up a Selenium driver (Firefox, Chrome, ...) on an OpenShift server via SSH? I installed selenium with:

$ pip install selenium

but how can I run it? When I run this:

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys

browser = webdriver.Firefox() 
browser.get("http://www.yahoo.com") 

I get this error:

>>> browser = webdriver.Firefox()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/var/lib/openshift/53fa54a8e0b8cd1d3c000611/app-root/runtime/srv/python/lib/python2.7/site-packages/selenium/webdriver/firefox/webdriver.py", line 49, in __init__
    self.binary = FirefoxBinary()
  File "/var/lib/openshift/53fa54a8e0b8cd1d3c000611/app-root/runtime/srv/python/lib/python2.7/site-packages/selenium/webdriver/firefox/firefox_binary.py", line 43, in __init__
    self._start_cmd = self._get_firefox_start_cmd()
  File "/var/lib/openshift/53fa54a8e0b8cd1d3c000611/app-root/runtime/srv/python/lib/python2.7/site-packages/selenium/webdriver/firefox/firefox_binary.py", line 162, in _get_firefox_start_cmd
    " Please specify the firefox binary location or install firefox")
RuntimeError: Could not find firefox in your system PATH. Please specify the firefox binary location or install firefox
>>> browser.get("http://www.yahoo.com")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'browser' is not defined
>>>

So I think I must install Firefox on my server. How do I do this?
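The error message above names a second route besides putting Firefox on PATH: telling Selenium where the binary is. The following is only a sketch of that idea, assuming Python 3 (for `shutil.which`) and the Selenium 2/3 style `firefox_binary` argument that matches the traceback. `find_firefox`, `make_browser`, and the `candidates` parameter are hypothetical names introduced for illustration.

```python
import os
import shutil


def find_firefox(candidates=()):
    """Return the path of a usable Firefox executable, or None.

    `candidates` is a hypothetical list of extra locations to try,
    for example a Firefox unpacked under the application's own directory.
    """
    # First look on PATH, which is also where Selenium looks by default.
    on_path = shutil.which("firefox")
    if on_path:
        return on_path
    # Then try the explicit locations supplied by the caller.
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


def make_browser(firefox_path):
    """Start Firefox through Selenium using an explicit binary path."""
    # Imported here so find_firefox() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.firefox_binary import FirefoxBinary

    # Selenium 2/3 style (assumption); newer releases configure the
    # binary through the Firefox Options object instead.
    return webdriver.Firefox(firefox_binary=FirefoxBinary(firefox_path))
```

A caller would pass the location of an unpacked Firefox in `candidates` and hand the result to `make_browser`. Note that this only resolves the PATH lookup; Firefox still needs a display to start.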

sudo apt-get install firefox xvfb does not work on OpenShift servers, so I adapted the installation instructions from http://joelinoff.com/blog/?p=853 and wrote this script:

#!/bin/bash
# bash is needed because the script uses pushd/popd.
# Change this to the last working libs (you may need some trial and error)

if [ ! -z $OPENSHIFT_DIY_LOG_DIR ]; then
    echo "$OPENSHIFT_LOG_DIR" > "$OPENSHIFT_HOMEDIR/.env/OPENSHIFT_DIY_LOG_DIR"

    nohup   OPENSHIFT_DIY_LOG_DIR2=${OPENSHIFT_LOG_DIR}   > /dev/null 2>&1
    echo $OPENSHIFT_DIY_LOG_DIR2
fi
# ========================================================
# Step 1. Download the archives.
# 1. firefox 15.0.1
# 2. java jre-7u7
# 3. flash 11.2
# ========================================================

# -p keeps the script re-runnable when the directories already exist.
mkdir -p $OPENSHIFT_HOMEDIR/app-root/runtime/srv
mkdir -p $OPENSHIFT_HOMEDIR/app-root/runtime/srv/firefox
firefox_dir=$OPENSHIFT_HOMEDIR/app-root/runtime/srv/firefox
mkdir -p $OPENSHIFT_HOMEDIR/app-root/runtime/tmp/
if [ ! -d "$OPENSHIFT_HOMEDIR/app-root/runtime/srv/siege/bin" ]; then
    cd $OPENSHIFT_HOMEDIR/app-root/runtime/srv/firefox
    mkdir repo
    pushd repo
    wget http://releases.mozilla.org/pub/mozilla.org/firefox/releases/15.0.1/linux-x86_64/en-US/firefox-15.0.1.tar.bz2
    wget javadl.sun.com/webapps/download/AutoDL?BundleId=68236 -O jre-7u7-linux-x64.tar.gz
    #wget http://fpdownload.macromedia.com/get/flashplayer/pdc/11.2.202.238/install_flash_player_11_linux.x86_64.tar.gz
    wget ftp://priede.bf.lu.lv/pub/MultiVide/MacroMedia/x64/install_flash_player_11_linux.x86_64.tar.gz
    popd
    # ========================================================
    # Step 2. Install in the rtf (release-to-field) directory.
    # ========================================================
    mkdir rtf
    pushd rtf
    tar jxf ../repo/firefox-15.0.1.tar.bz2
    tar zxf ../repo/jre-7u7-linux-x64.tar.gz
    mkdir -p firefox/plugins
    pushd firefox/plugins
    tar zxf ${firefox_dir}/repo/install_flash_player_11_linux.x86_64.tar.gz

    # This installs the java plugin.
    ln -s ${firefox_dir}/rtf/jre1.7.0_07/lib/amd64/libnpjp2.so .
    popd
    popd
    # ========================================================
    # Step 3. Create a run script.
    # ========================================================
cat >rtf/run.sh <<EOF
#!/bin/bash
MYARGS="\$*"
export PATH="${firefox_dir}/rtf/firefox:${firefox_dir}/rtf/jre1.7.0_07/bin:\${PATH}"
export CLASSPATH="${firefox_dir}/rtf/jre1.7.0_07/lib:\${CLASSPATH}"
firefox \$MYARGS
EOF
    chmod a+x rtf/run.sh

    # ========================================================
    # Now you can run it as shown below.
    # I added flash and java test URLs to make sure that it
    # was working.
    # ========================================================


fi

echo "*****************************"
echo "***         USAGE         ***"
echo "${firefox_dir}/rtf/run.sh http://www.adobe.com/software/flash/about/ http://javatester.org/"
${firefox_dir}/rtf/run.sh http://www.adobe.com/software/flash/about/ http://javatester.org/
echo "*****************************"


echo "*****************************"
echo "***  F I N I S H E D !!   ***"
echo "*****************************"

But when I run it with:

${firefox_dir}/rtf/run.sh http://www.adobe.com/software/flash/about/ http://javatester.org/

I get this error:

Error: no display specified

So what must I do?

Thanks a lot.
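"Error: no display specified" means Firefox found no X display, that is, the DISPLAY environment variable was unset. One common way around this on a server is a virtual display such as Xvfb. The sketch below assumes an Xvfb binary is actually available on PATH, which the question suggests may not be the case on OpenShift; `env_with_display` and `start_xvfb` are hypothetical helper names, and display `:99` is an arbitrary choice.

```python
import os
import shutil
import subprocess


def env_with_display(display=":99", base_env=None):
    """Return a copy of the environment with DISPLAY set to `display`."""
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = display
    return env


def start_xvfb(display=":99"):
    """Start Xvfb on `display` if it is installed.

    Returns the Popen object, or None when no Xvfb binary is on PATH.
    The caller is responsible for terminating the process.
    """
    xvfb = shutil.which("Xvfb")
    if xvfb is None:
        return None
    # "-screen 0 1280x1024x24" defines one virtual screen with 24-bit colour.
    return subprocess.Popen([xvfb, display, "-screen", "0", "1280x1024x24"])
```

With Xvfb running, the `run.sh` wrapper or the Selenium script would be launched with `env=env_with_display()` (or with `os.environ["DISPLAY"] = ":99"` set beforehand) so that Firefox attaches to the virtual screen. Xvfb may need a moment to come up before the browser starts.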


Answer 1:


I have not used Selenium, but I have successfully used PhantomJS and CasperJS on my OpenShift app for web scraping. PhantomJS is a true headless browser, so it needs no display. Using CasperJS on top of it makes scripting easier, and both have good docs.
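Since the question is Python-based, one way to try this suggestion is to drive the `phantomjs` command from Python with `subprocess`. This is a minimal sketch, assuming a `phantomjs` executable is on PATH; `build_command` and `fetch_rendered_html` are hypothetical helper names, and the embedded script uses only PhantomJS's `webpage` and `system` modules.

```python
import os
import subprocess
import tempfile

# Minimal PhantomJS script: open the URL passed as the first argument and
# print the page's HTML once the load callback fires.
PHANTOM_SCRIPT = """
var system = require('system');
var page = require('webpage').create();
page.open(system.args[1], function (status) {
    if (status === 'success') {
        console.log(page.content);
        phantom.exit(0);
    } else {
        phantom.exit(1);
    }
});
"""


def build_command(script_path, url, phantomjs="phantomjs"):
    """Build the argument list for one PhantomJS run."""
    return [phantomjs, script_path, url]


def fetch_rendered_html(url, phantomjs="phantomjs"):
    """Run PhantomJS on `url` and return whatever the script printed."""
    with tempfile.NamedTemporaryFile("w", suffix=".js", delete=False) as f:
        f.write(PHANTOM_SCRIPT)
        script_path = f.name
    try:
        # Raises CalledProcessError if PhantomJS exits with a non-zero status.
        return subprocess.check_output(
            build_command(script_path, url, phantomjs),
            universal_newlines=True,
        )
    finally:
        os.remove(script_path)
```

Calling `fetch_rendered_html("http://www.yahoo.com")` would return the HTML as a string, which can then be parsed with any Python HTML parser. Pages that load data asynchronously after the initial load may need extra waiting logic in the script.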



Source: https://stackoverflow.com/questions/25910106/python-selenium-in-openshift-server
