page.open() function doesn't work properly for some URLs

Submitted by 限于喜欢 on 2019-12-25 04:53:19

Question


I am new to Node. I have written some code using Node and Phantom to scrape a website. My code works for google.com but not for facebook.com, which I assume is because Facebook internally makes AJAX requests to other files to get its data.

var phantom = require('phantom');

phantom.create(function(ph) {
    return ph.createPage(function(page) {
        return page.open("https://facebook.com/", function(status) {
            if (status !== 'success') {
                console.log('Unable to load the url!');
                ph.exit();
            } else {
                // Wait 5 seconds so the page's AJAX requests can finish.
                setTimeout(function() {
                    return page.evaluate(function() {
                        // This function runs inside the page context.
                        return document.getElementsByTagName('body')[0].innerHTML;
                    }, function(result) {
                        console.log(result); // Log out the data.
                        ph.exit();
                    });
                }, 5000);
            }
        });
    });
});

So when I run my code against Facebook it prints "Unable to load the url!", but against Google it prints the body of the page.

Can anybody tell me what changes I should make to get the result?

PhantomJS version: 1.9.0


Answer 1:


You should pass some command-line options to PhantomJS so that it uses TLSv1 instead of SSLv3 and, optionally, ignores SSL errors (--web-security=false might also be helpful):

phantom.create('--ssl-protocol=tlsv1', '--ignore-ssl-errors=true', function(ph) {
    ...

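The reason this might be an issue is that many sites have removed SSLv3 support because of the POODLE vulnerability. PhantomJS versions before 1.9.8 use SSLv3 by default, so on such sites the HTTPS handshake fails and page.open() reports a status other than 'success'.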
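As a rough sketch, the options above could be wired into the question's script as shown below. It assumes the old callback-style phantom module used in the question, where phantom.create() accepts flag strings before the callback. The names buildPhantomArgs and fetchBody are hypothetical helpers, not part of the phantom module, and the module is passed in as a parameter so the sketch has no hard dependency on it. The 5-second wait from the question is left out for brevity.

```javascript
// Hypothetical helper (not part of the phantom module): turns an options
// object into PhantomJS command-line flags such as '--ssl-protocol=tlsv1'.
function buildPhantomArgs(options) {
    return Object.keys(options).map(function (name) {
        return '--' + name + '=' + String(options[name]);
    });
}

// Hypothetical wrapper around the question's code. phantomModule is assumed
// to expose the callback-style API from the question:
// create(flag1, flag2, ..., callback).
function fetchBody(phantomModule, url, done) {
    var args = buildPhantomArgs({
        'ssl-protocol': 'tlsv1',      // avoid SSLv3
        'ignore-ssl-errors': true     // optional
    });

    var onCreated = function (ph) {
        ph.createPage(function (page) {
            page.open(url, function (status) {
                if (status !== 'success') {
                    ph.exit();
                    return done(new Error('Unable to load ' + url));
                }
                page.evaluate(function () {
                    // Runs inside the page context.
                    return document.body.innerHTML;
                }, function (result) {
                    ph.exit();
                    done(null, result);
                });
            });
        });
    };

    // Equivalent to: phantomModule.create(args[0], args[1], onCreated)
    phantomModule.create.apply(phantomModule, args.concat([onCreated]));
}
```

It would be called as fetchBody(require('phantom'), 'https://facebook.com/', function (err, html) { ... }), where err is set when the page fails to load and html holds the body markup otherwise.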

This answer provides the solution for plain PhantomJS. My answer here elaborates on that issue in more detail for CasperJS.



Source: https://stackoverflow.com/questions/28226516/page-open-function-doesnt-work-properly-for-some-urls
