phantomjs | 易学教程

Error: Cannot find module 'libxmljs'

Question: I am trying to parse XML using PhantomJS with the following file, documentpreviewer1.js:

    var webPage = require('webpage');
    var page = webPage.create();
    var url = "http://xxx/sitemap.xml";
    page.open(url, function(status){
        if(status != 'success'){
            console.log('Unable to access cfc');
        } else {
            var xml = page.content;
            var libxmljs = require("libxmljs");
            var xmlDoc = libxmljs.parseXml(xml);
            var url1 = xmlDoc.get('//urlset/url[0]/loc');
            console.log(url1);
        }
    });

When I run the above code, I get the

How to make PhantomJS wait for a specific condition on the page

Question: I want to implement something like this:

    function(){
        var ua = page.evaluate(function() {
            while (document.getElementById("td_details87").parentElement.children[11].textContent == "Running" ) {
                console.log ("running.....");
                sleep(10000); //10 seconds
            };
            console.log ("DONE");
        });
    },

How can I implement the sleep function, and is a while loop possible here?

Answer 1: There is no blocking sleep() function in JavaScript. If you want to sleep then you have to use some asynchronous function like setTimeout(callback,
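The non-blocking approach the answer points to can be sketched roughly as below. This is a minimal illustration only, not the answerer's code: `waitFor` and its parameters (`condition`, `done`, `timeoutMs`, `intervalMs`) are hypothetical names introduced here. Instead of sleeping inside a `while` loop, the check reschedules itself with `setTimeout` until the condition holds or a timeout elapses.

```javascript
// Hypothetical helper (not part of PhantomJS): poll `condition` every
// `intervalMs` milliseconds until it returns true or `timeoutMs` has elapsed,
// then call `done(err)` with null on success or an Error on timeout.
function waitFor(condition, done, timeoutMs, intervalMs) {
    var start = Date.now();

    function check() {
        if (condition()) {
            done(null);
        } else if (Date.now() - start >= timeoutMs) {
            done(new Error('timed out after ' + timeoutMs + ' ms'));
        } else {
            // Schedule the next check instead of blocking the event loop.
            setTimeout(check, intervalMs);
        }
    }

    check();
}
```

In a PhantomJS script, the condition would presumably wrap `page.evaluate` and return whether the cell from the question no longer reads "Running", with the 10-second pause passed as `intervalMs`. That wiring is an assumption and is not shown in the truncated answer.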

setting PhantomJSDriverService.PHANTOMJS_GHOSTDRIVER_PATH_PROPERTY

Question: I am having difficulty setting the capability PhantomJSDriverService.PHANTOMJS_GHOSTDRIVER_PATH_PROPERTY correctly in my Java program so that I can use the newest version of GhostDriver from GitHub together with my installed PhantomJS version (1.9.1). Here is what I do in my Java program:

    DesiredCapabilities caps = DesiredCapabilities.phantomjs();
    caps.setCapability(
        PhantomJSDriverService.PHANTOMJS_EXECUTABLE_PATH_PROPERTY,
        "/xxx/phantomjs-1.9.1-linux-x86_64/bin/phantomjs" );
    caps.setCapability(

Selenium webdriver + PhantomJS processes not closing

Question: Here's just about the simplest open and close you can do with WebDriver and PhantomJS:

    from selenium import webdriver
    crawler = webdriver.PhantomJS()
    crawler.set_window_size(1024,768)
    crawler.get('https://www.google.com/')
    crawler.quit()

On Windows 7, every time I run my code to test something out, new instances of the conhost.exe and phantomjs.exe processes start and never quit. Am I doing something stupid here? I figured the processes would quit when crawler.quit() ran...

Answer 1: Go figure

Phantomjs with R

Question: I am trying to scrape data from a web page. Since the page has dynamic content, I used PhantomJS to handle it. However, with the code I am using, I can only download the data visible on the web page. I need to enter a date range and then submit it to get all the data I want. Here is the code I used:

    library(xml2)
    library(rvest)
    connection <- "pr.js"
    writeLines(sprintf("var page=require('webpage').create();
    var fs = require('fs');
    page.open('%s',function(){
    console.log(page.content);//page

Selenium ends randomly with uncaught error

Question: I'm using Mocha, WebdriverIO, and PhantomJS, and I'm trying to find out why Selenium fails at random (50% of the time it's OK, 50% it breaks in different tests with the same code). The error is:

    Uncaught RuntimeError (UnknownError:13) An unknown server-side error occurred while processing the command.
    Problem: POST /session//url
    Build info: version: '2.42.0', revision: '5e82430', time: '2014-05-22 19:00:03'
    System info: host: 'example.com', ip: '127.0.0.1', os.name: 'Linux', os.arch: 'amd64', os.version: '2.6

Trouble Parsing Text using BeautifulSoup and Python

Question: I am trying to retrieve the comment section on regulations.gov pages. An example is the paragraph "Restrictions on Proprietary Trading... with free market driven valuations." on http://www.regulations.gov/#!documentDetail;D=OCC-2011-0014-0032. I am using BeautifulSoup and Python and have the following code:

    from bs4 import BeautifulSoup
    from selenium import webdriver
    driver = webdriver.PhantomJS()
    driver.get("http://www.regulations.gov/#!documentDetail;D=OCC-2011-0014-0032")
    source = driver

Permission denied error with phantomjs

Question: I am using Rails 4.0.2, Guard 2.2.4, guard-rspec 4.2.4, rspec-rails 2.14.0, Capybara 2.2.1 and Poltergeist 1.5.0 on Ruby 2.0.0-p353 and OS X Mavericks. When I run bundle exec guard, I get many failures with this error message:

    An error occurred in an after hook
    Errno::EACCES: Permission denied - /usr/local/Cellar/phantomjs
    occurred at /Users/gillesmath/.rvm/rubies/ruby-2.0.0-p353/lib/ruby/2.0.0/open3.rb:211:in `spawn'

I checked the permission on /usr/local/Cellar/phantomjs and didn't notice

Faking the Referer Header in PhantomJS doesn't work

Question: I want my code to fake the Referer header seen by analytics systems (such as Google Analytics), but it doesn't work. I have added 'var settings ={...//...}', 'page.onLoadStarted = function() {page.customHeaders = {};' and 'page.open(...,settings, ...', but the visits are still recognized as direct traffic in the analytics. Here is the code:

    var page = require('webpage').create();
    var settings = { headers: { "Referer": "http://google.com" } };
    var urls = ['http://china.com/','http://usa.com/

bitbucket rate limiting phantomjs

Question: My CI builds keep failing with:

    > phantomjs@1.9.7-15 install /home/travis/build/redgeoff/paste-image/node_modules/mocha-phantomjs/node_modules/phantomjs
    > node install.js
    PhantomJS detected, but wrong version 1.9.8 @ /usr/local/phantomjs/bin/phantomjs.
    Downloading https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-1.9.7-linux-x86_64.tar.bz2
    Saving to /tmp/phantomjs/phantomjs-1.9.7-linux-x86_64.tar.bz2
    Receiving...
    Error requesting archive.
    Status: 403
    Request options: { "uri": "https:/