Having Trouble Using CURL and PHP to Get Google Search Results Through a Proxy

纵然是瞬间 提交于 2019-12-08 10:45:12

问题


This script works fine when getting google.com but not with google.com/search?q=test. When I don't use CURLOPT_FOLLOWLOCATION, I get a 302 Moved. When I do use it, I get a page asking me to input a captcha. I've tried several different U.S. based proxies and have varied the user agent string. Is there something I'm missing here?

function my_fetch($url,$proxy,$user_agent='Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.8) Gecko/2009032609 Firefox/3.0.8') 
{ 
    $ch = curl_init(); 
    curl_setopt ($ch, CURLOPT_URL, $url); 
    curl_setopt ($ch, CURLOPT_PROXY, $proxy);
    curl_setopt ($ch, CURLOPT_USERAGENT, $user_agent); 
    curl_setopt ($ch, CURLOPT_HEADER, 0);
    curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt ($ch, CURLOPT_REFERER, 'http://www.google.com/'); 
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);

    curl_setopt ($ch, CURLOPT_TIMEOUT, 20);
    $result = curl_exec ($ch); 
    curl_close ($ch); 
    return $result; 
}

$url = 'http://www.google.com/search?q=test';

$proxy = '152.26.53.4:80';
echo my_fetch($url,$proxy);

Please don't respond with suggestions to use the API instead. The API is not sufficient for my needs.


回答1:


Google is No Longer for cURL.

Google is no longer giving access through Curl, it may gives you 302 Moved message, If you want to use it, you have to use the API for it.

Thanks




回答2:


You can try to do that with PhantomJS:

var page = require("webpage").create();
var homePage = "http://www.google.com/";

page.open(homePage);
page.onLoadFinished = function(status) {
 var url = page.url;

console.log("Status:  " + status);
console.log("Loaded:  " + url);


page.includeJs("http://code.jquery.com/jquery-1.8.3.min.js", function() {
  console.log("Loaded jQuery!");
  page.evaluate(function() {
    var searchBox = $(".lst");
    var searchForm = $("form");

    searchBox.val("your query");
    searchForm.submit();
  });
});

window.setTimeout(
        function () {
          page.render( 'google.png' );
          phantom.exit(0);
        },
        1000 // wait 5,000ms (5s)
      );


};


来源:https://stackoverflow.com/questions/8609962/having-trouble-using-curl-and-php-to-get-google-search-results-through-a-proxy

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!