Can not download html with phantomjs [closed]

Submitted by 我是研究僧i on 2019-12-09 03:58:36

Question


I have 3 different files in my project, laid out as follows:

  • phantomjs
      • phantomjs.js
      • phantomjs.exe
  • index.php

index.php:

$phantom_script = dirname(__FILE__). '\phantomjs\phantomjs.js';

$response =  exec ('\phantomjs\phantomjs.exe' . $phantom_script);

echo $response;

phantomjs\phantomjs.js:

var webPage = require('webpage');
var page = webPage.create();

page.open('http://www.google.com', function(status) {
   console.log(page.content);
   phantom.exit();
});

Answer 1:


Your usage of PhantomJS is correct according to the documentation: http://phantomjs.org/api/webpage/property/content.html

PHP's exec() function returns only the last line of the command's output. That line may be blank, in which case echo prints nothing visible. See http://php.net/manual/fr/function.exec.php

exec() also accepts a second parameter, &$output, passed by reference. It is filled with an array containing every line of the output.

A problem you may face later: the page may need to be evaluated before you read its DOM content. One way is to read the markup from inside the page, for example the innerHTML of the html tag, i.e. $('html').html();

If the page does not include jQuery, you can inject it yourself; see this example: https://github.com/ariya/phantomjs/blob/master/examples/phantomwebintro.js
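As a rough sketch of reading the markup from inside the page without depending on jQuery, the script below uses page.evaluate to return document.documentElement.outerHTML. The helper names getRenderedHtml and dumpRenderedHtml are made up for illustration, and the load-status check is an addition that is not in the original script; treat this as one possible variant rather than the required fix.

```javascript
// Runs inside the page (browser) context, so `document` is the loaded page's DOM.
// Hypothetical helper name, not part of the PhantomJS API.
function getRenderedHtml() {
  return document.documentElement.outerHTML;
}

// Hypothetical helper: open the URL, print the markup, then exit PhantomJS.
function dumpRenderedHtml(url) {
  var page = require('webpage').create();
  page.open(url, function (status) {
    if (status !== 'success') {
      // Report the failure instead of printing an empty document.
      console.log('Failed to load ' + url);
      phantom.exit(1);
      return;
    }
    // page.evaluate runs the function in the page and returns its result.
    console.log(page.evaluate(getRenderedHtml));
    phantom.exit();
  });
}

// Only start when executed by PhantomJS, where the global `phantom` object exists.
if (typeof phantom !== 'undefined') {
  dumpRenderedHtml('http://www.google.com');
}
```

The script still prints the markup over many lines, so the PHP side would need to collect the whole output rather than only the last line.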

Note also that Google may actively try to prevent users from scraping and saving its search results, though I am not sure about that.



Source: https://stackoverflow.com/questions/30258097/can-not-download-html-with-phantomjs
