phantomjs

How to get Google's Knowledge Graph “people also search for” content?

独自空忆成欢 submitted on 2019-12-07 00:46:29
I'm trying to get Google's "People also search for" content from the search results page, and I'm using PhantomJS to scrape the results. However, the Knowledge Graph part I need does not show up in the body I get. Does anyone know what I could do to have it included? Here's the code:

    var phantom = require('phantom');
    phantom.create(function (ph) {
      ph.createPage(function (page) {
        page.open("http://www.google.com/ncr", function (status) {
          console.log("opened google NCR ", status);
          page.evaluate(function () { return document.title; }, function (result) {
            console.log('Page title is ' + result);

Set cookie for request in CasperJS

我只是一个虾纸丫 submitted on 2019-12-07 00:37:41
Question: I want to load a page using CasperJS, but how can I send a cookie that was exported from Chrome's HTTP request header for that page? For example: "SUB=_2AkMjHt3gf8NhqwJRmPkQzG_qZIp_yA3EiebDAHzsJxJTHmMJ7IUyLkMN2K7WzRJvm-Tv3YY0xyZo; SUBP=0033WrSXqPxfM72-Ws9jqgMF55529P9D9WhCT_2hbJ1W1Cc4xfF-mFPo;"
Answer 1: There are multiple ways, but the easiest is to use the page.addCookie or phantom.addCookie functions that PhantomJS provides. You have to set the domain (and path) yourself. Keep in mind that
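
The approach in the answer above could be sketched as follows: split the exported Cookie header into name/value pairs and turn each into a cookie object carrying a domain and path. The helper name parseCookieHeader, the domain '.example.com' and the shortened cookie values are assumptions made for illustration, not part of the original answer. The only PhantomJS call used is phantom.addCookie, and it is guarded so the helper can also be tried outside PhantomJS.

```javascript
// Hypothetical helper: converts a raw Cookie request-header string into
// cookie objects with the fields phantom.addCookie expects.
function parseCookieHeader(header, domain, path) {
  return header
    .split(';')
    .map(function (pair) { return pair.trim(); })
    .filter(function (pair) { return pair.length > 0; })
    .map(function (pair) {
      // Split on the first '=' only, because values may contain '='.
      var eq = pair.indexOf('=');
      return {
        name: eq === -1 ? pair : pair.slice(0, eq),
        value: eq === -1 ? '' : pair.slice(eq + 1),
        domain: domain,      // must be supplied by the caller
        path: path || '/'    // default to the site root
      };
    });
}

// Only runs inside PhantomJS or CasperJS, where the global `phantom` exists.
// The header string and domain below are placeholders.
if (typeof phantom !== 'undefined' && typeof phantom.addCookie === 'function') {
  parseCookieHeader('SUB=abc123; SUBP=def456;', '.example.com')
    .forEach(function (cookie) { phantom.addCookie(cookie); });
}
```

The cookies need to be added before the page is opened, and the domain has to match the site the header was exported from.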

Rendering rotated text to PDF in PhantomJS

此生再无相见时 submitted on 2019-12-07 00:28:52
Question: I have an HTML page that contains several pieces of text rotated with the following CSS:

    .rotate { transform: rotate(90deg); transform-origin: 50% 50%; }

When I open the page directly in the browser, it renders as expected. When I render the page through PhantomJS, the rotation seems to be ignored. I upgraded to PhantomJS 2.0.0, but the issue remains. Is there any way to make this work?
Answer 1: I tested it with PhantomJS 1.9.18 in a node application. With -webkit

Java PhantomJSDriver disable all logs in console

邮差的信 submitted on 2019-12-06 18:01:29
Question: I'm developing a small console app using Selenium and I need to turn off all of its logs. I have tried phantomJSDriver.setLogLevel(Level.OFF); but it does not work. How do I disable all logs in a console application that uses Selenium and PhantomJS (GhostDriver)?
Answer 1:

    PhantomJSDriverService service = new PhantomJSDriverService.Builder()
        .usingPhantomJSExecutable(new File(VariableClass.phantomjs_file_path))
        .withLogFile(null)
        .build();

Answer 2: This one works for me.

PhantomJS Proxy when using Remote Webdriver?

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-06 14:57:08
I am trying to use Selenium in Python with PhantomJS. I am running a Selenium hub server, so I am using webdriver.Remote to start a webdriver. The normal way to pass a proxy to PhantomJS is:

    service_args = [ '--proxy=127.0.0.1:9999', '--proxy-type=socks5', ]
    browser = webdriver.PhantomJS('../path_to/phantomjs', service_args=service_args)

This won't work, though, for webdriver.Remote(service_args=service_args), because webdriver.Remote takes only desired_capabilities, not service args, as a parameter. Is there any way to pass a proxy to PhantomJS as a desired capability? The typical way one would do so with

Visiting multiple urls using PhantomJS evaluating Error

这一生的挚爱 submitted on 2019-12-06 14:51:43
Question: I have the code below and want to pause between visits, so I added a setInterval, but it does not work:

    var page = require('webpage').create();
    // the urls to navigate to
    var urls = [ 'http://blogger.com/', 'https://github.com/', 'http://reddit.com/' ];
    var i = 0;
    // the recursion function
    var genericCallback = setInterval(function () {
      return function (status) {
        console.log("URL: " + urls[i]);
        console.log("Status: " + status);
        // exit if there was a problem with the
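
One likely reason the snippet in the question fails is that the function passed to setInterval only returns another function, which is never called. A minimal sketch of an alternative, assuming the goal is one visit at a time with a fixed pause: chain the visits and schedule the next one with setTimeout from inside the previous visit's callback. The names visitSequentially, openPage, onDone and schedule are hypothetical. The only PhantomJS calls used are require('webpage').create(), page.open and phantom.exit, and they sit behind a guard so the helper can be exercised elsewhere.

```javascript
// Hypothetical helper: opens each URL in turn and waits delayMs between visits.
// openPage(url, callback) is a stand-in for PhantomJS's page.open.
// schedule is optional and defaults to a thin wrapper around setTimeout.
function visitSequentially(urls, openPage, delayMs, onDone, schedule) {
  var wait = schedule || function (fn, ms) { setTimeout(fn, ms); };
  var results = [];

  function next(i) {
    if (i >= urls.length) {   // nothing (left) to visit
      onDone(results);
      return;
    }
    openPage(urls[i], function (status) {
      results.push({ url: urls[i], status: status });
      if (i + 1 < urls.length) {
        // Pause, then move on to the next URL.
        wait(function () { next(i + 1); }, delayMs);
      } else {
        onDone(results);
      }
    });
  }

  next(0);
}

// Only runs inside PhantomJS, where the global `phantom` exists.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  var urls = ['http://blogger.com/', 'https://github.com/', 'http://reddit.com/'];
  visitSequentially(
    urls,
    function (url, callback) { page.open(url, callback); },
    2000, // pause length in milliseconds, chosen arbitrarily
    function (results) {
      results.forEach(function (r) { console.log(r.url + ': ' + r.status); });
      phantom.exit();
    }
  );
}
```

Because each visit starts only after the previous callback has fired, no two page.open calls overlap on the same page object, which a fixed-rate setInterval does not guarantee.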

uncss Error: C.UTF-8: not a valid language tag

僤鯓⒐⒋嵵緔 submitted on 2019-12-06 14:18:44
Hi, I am trying to use UnCSS for the first time to remove unused styles from my CSS, and I am getting the following error:

    Fontconfig warning: ignoring C.UTF-8: not a valid language tag
    /home/ubuntu/.nvm/v0.10.35/lib/node_modules/uncss/node_modules/bluebird/js/main/async.js:43
    fn = function () { throw arg; };
    ^
    Error: Fontconfig warning: ignoring C.UTF-8: not a valid language tag
    at Socket.onStderr (/home/ubuntu/.nvm/v0.10.35/lib/node_modules/uncss/node_modules/phridge/lib/spawn.js:79:28)
    at Socket.emit (events.js:117:20)
    at Socket.<anonymous> (_stream_readable.js:765:14)
    at Socket.emit (events.js

PhantomJS how to render javascript in html string

时间秒杀一切 submitted on 2019-12-06 14:17:59
Question: I'm trying to get PhantomJS to take an HTML string and render the full page as a browser would (including executing any JavaScript in the page source). I need the resulting HTML as a string. I have seen examples of page.open, which is of no use here since I already have the page source in my database. Do I need to use page.open to trigger the JavaScript rendering engine in PhantomJS? Is there any way to do this all in memory (i.e., without page.open making a request or reading

phantomjs with nohup not working

痴心易碎 submitted on 2019-12-06 13:54:05
Question: I was trying to run a PhantomJS script over SSH using the nohup command, but nohup threw an error, which I found in the nohup.out file. My command was:

    nohup phantomjs example.js &

Running phantomjs example.js without nohup works perfectly. I also created a bash script to run this command with nohup, but both times I got this error:

    events.js:72
    throw er; // Unhandled 'error' event
    ^
    Error: EBADF, read

Code of example.js:

    var page = require('webpage').create(), system = require('system'), address,

Scraping dynamic page content phantomjs

此生再无相见时 submitted on 2019-12-06 12:10:12
Question: My company uses a website that hosts all of our FAQ and customer questions. We plan to wipe out all of the old data and enter new data, and the service has no backup or archive option for questions we no longer want to appear. I've tried to scrape the site using Perl and Mechanize, but I'm missing the customer comments on the page because they are loaded through Ajax. I have looked at PhantomJS and can get the pages to save to an image using an