Question
I am new to Node. I have written a script using Node and the phantom module to scrape a website. My code works for google.com but not for facebook.com, because Facebook internally makes AJAX requests to other files to get its data.
var phantom = require('phantom');

phantom.create(function(ph) {
    return ph.createPage(function(page) {
        return page.open("https://facebook.com/", function(status) {
            if (status !== 'success') {
                console.log('Unable to load the url!');
                ph.exit();
            } else {
                setTimeout(function() {
                    return page.evaluate(function() {
                        return document.getElementsByTagName('body')[0].innerHTML;
                    }, function(result) {
                        console.log(result); // Log out the data.
                        ph.exit();
                    });
                }, 5000);
            }
        });
    });
});
So basically, when I execute my code, Facebook returns "Unable to load the url!" but Google returns the body HTML.
Can anybody tell me what changes I should make to get the result?
PhantomJS version: 1.9.0
Answer 1:
You should pass some command-line options to PhantomJS so that it does not use SSLv3 but only TLSv1, and optionally ignores SSL errors (--web-security=false might also be helpful):
phantom.create('--ssl-protocol=tlsv1', '--ignore-ssl-errors=true', function(ph) {
...
The reason this might be an issue is that many sites have removed SSLv3 support because of the POODLE vulnerability.
This answer provides the solution for plain PhantomJS. My answer here elaborates on that issue in more detail for CasperJS.
Source: https://stackoverflow.com/questions/28226516/page-open-function-doesnt-work-properly-for-some-urls