How to run PhantomJS as a server and call it remotely?

Submitted by 梦想与她 on 2019-12-30 00:39:09

Question


This is probably a very basic question. I would like to run the headless browser PhantomJS as a server rather than as a command-line tool.

Once it is running, I would like to call it remotely over HTTP. All I need is to send a URL and get back the HTML output. I need it to generate HTML for an AJAX application so that the application becomes searchable.

Is this possible?


Answer 1:


PhantomJS can run as a web server on its own, because it ships with the Web Server Module. The examples folder includes a server.js script that demonstrates this. It runs standalone, without any dependencies (no node.js required).

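var page = require('webpage').create(),
    server = require('webserver').create();

var port = 8080; // example port; pick any free one

var service = server.listen(port, function (request, response) {
    console.log('Request received at ' + new Date());
    // TODO: parse `request` and determine where to go
    var someUrl = 'http://example.com/'; // placeholder target
    page.open(someUrl, function (status) {
        if (status !== 'success') {
            // Always answer, otherwise the client waits forever
            console.log('Unable to load ' + someUrl);
            response.statusCode = 502;
            response.write('Unable to load ' + someUrl);
            response.close();
            return;
        }
        response.statusCode = 200;
        response.headers = {
            'Cache-Control': 'no-cache',
            'Content-Type': 'text/plain;charset=utf-8'
        };
        // TODO: do something on the page and generate `result`;
        // here the rendered HTML is returned unchanged
        var result = page.content;
        response.write(result);
        response.close();
    });
});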
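The first TODO in the server above (deciding where to go) could be filled in roughly as follows. This is only a sketch: the helper name `extractTargetUrl` and the `url` query parameter are assumptions made for illustration, not part of PhantomJS. It relies on `request.url` holding the path and query string of the incoming request, and it is written in plain ES5 so that it runs inside PhantomJS as well as in node.

```javascript
// Hypothetical helper: pull the target URL out of a request path such as
// "/?url=http%3A%2F%2Fexample.com%2F". Returns null when the parameter is
// missing, malformed, or not an http(s) URL.
function extractTargetUrl(requestUrl) {
    var queryStart = requestUrl.indexOf('?');
    if (queryStart === -1) {
        return null;
    }
    var pairs = requestUrl.slice(queryStart + 1).split('&');
    for (var i = 0; i < pairs.length; i++) {
        var eq = pairs[i].indexOf('=');
        // Skip parameters without a value and parameters other than `url`
        if (eq === -1 || pairs[i].slice(0, eq) !== 'url') {
            continue;
        }
        var value;
        try {
            value = decodeURIComponent(pairs[i].slice(eq + 1).replace(/\+/g, ' '));
        } catch (e) {
            return null; // malformed percent-encoding
        }
        // Only allow web pages, not file:// or other schemes
        return /^https?:\/\//i.test(value) ? value : null;
    }
    return null;
}
```

Inside the `server.listen` handler it could be called as `extractTargetUrl(request.url)`, answering with a 400 status when it returns null. Rejecting anything that is not http(s) keeps remote callers from making the server open local files.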

If you would rather run PhantomJS through node.js, that is also easy to do with phantomjs-node, a PhantomJS bridge for node.

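var http = require('http');
var phantom = require('phantom');

// Callback-style API of older phantomjs-node releases;
// newer releases of the `phantom` package return promises instead.
phantom.create(function (ph) {
  ph.createPage(function (page) {
    http.createServer(function (req, res) {
      // TODO: parse `req` and determine where to go
      var someURL = 'http://example.com/'; // placeholder target
      page.open(someURL, function (status) {
        if (status !== 'success') {
          res.writeHead(502, {'Content-Type': 'text/plain'});
          return res.end('Unable to load ' + someURL);
        }
        // TODO: do something on the page and generate `result`;
        // here the rendered HTML is returned unchanged
        page.evaluate(function () {
          return document.documentElement.outerHTML;
        }, function (result) {
          res.writeHead(200, {'Content-Type': 'text/plain'});
          res.end(result);
        });
      });
    }).listen(8080);
  });
});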
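Either server can then be called remotely with any HTTP client. Here is a minimal node sketch of such a client, assuming the server reads the target page from a `url` query parameter (a convention that is not in the code above and would have to be implemented in the request handler). The function names `buildRenderRequestUrl` and `fetchRenderedHtml` are made up for this example.

```javascript
var http = require('http');

// Hypothetical convention: the page to render travels in a `url` query parameter.
function buildRenderRequestUrl(serverBase, targetUrl) {
    // Strip trailing slashes from the base so the result has exactly one
    return serverBase.replace(/\/+$/, '') + '/?url=' + encodeURIComponent(targetUrl);
}

// Ask the render server for a page and hand the body to callback(err, html).
function fetchRenderedHtml(serverBase, targetUrl, callback) {
    http.get(buildRenderRequestUrl(serverBase, targetUrl), function (res) {
        var chunks = [];
        res.on('data', function (chunk) {
            chunks.push(chunk);
        });
        res.on('end', function () {
            var body = Buffer.concat(chunks).toString('utf8');
            if (res.statusCode !== 200) {
                return callback(new Error('Render server answered ' + res.statusCode + ': ' + body));
            }
            callback(null, body);
        });
    }).on('error', callback);
}
```

A call would look like `fetchRenderedHtml('http://localhost:8080', 'http://example.com/', function (err, html) { ... })`, with the host and port replaced by wherever the server actually listens.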

Notes

You can use this as is, as long as you don't have multiple requests at the same time. If you do, you either need to synchronize the requests (because there is only one page object), or you need to create a new page object for every request and close() it when you're done.
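One way to synchronize the requests is to push each one through a small queue that only starts the next job once the previous one has finished. The following is a sketch of that idea, not something PhantomJS provides: `createSerialQueue` is a hypothetical helper, written in plain ES5 so it works in PhantomJS and in node alike.

```javascript
// Hypothetical helper: run asynchronous tasks strictly one after another.
// Each task receives a `done` callback and must call it when it has finished.
function createSerialQueue() {
    var pending = [];
    var busy = false;

    function next() {
        if (busy || pending.length === 0) {
            return;
        }
        busy = true;
        var task = pending.shift();
        task(function done() {
            busy = false;
            next(); // start the following task, if any
        });
    }

    return {
        push: function (task) {
            pending.push(task);
            next();
        }
    };
}

// Usage inside the request handler (sketch, names as in the server above):
//
// var queue = createSerialQueue();
// server.listen(port, function (request, response) {
//     queue.push(function (done) {
//         page.open(someUrl, function (status) {
//             // ... write the response ...
//             response.close();
//             done(); // release the shared page for the next request
//         });
//     });
// });
```

The alternative from the note, a fresh page per request, avoids the queue entirely: call `require('webpage').create()` inside the handler and `close()` that page after the response has been sent. That allows requests to overlap at the cost of more memory per request.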




Answer 2:


The easiest way is to write a Python script (or something similarly simple) that starts the server, and to communicate with it over Python websockets, using a web form of some sort to submit a website and get back the page source. Any automation can be done via cron jobs or, if you are on Windows, with the Task Scheduler, which can autostart the Python script.



Source: https://stackoverflow.com/questions/30713657/how-to-run-phantomjs-as-a-server-and-call-it-remotely
