Get http-statuscode without body using cURL?

一世执手 提交于 2019-12-11 02:40:30

问题


I want to parse a lot of URLs to only get their status codes.

So what I did is:

$handle = curl_init($url -> loc);
             curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
             curl_setopt($handle, CURLOPT_HEADER  , true);  // we want headers
             curl_setopt($handle, CURLOPT_NOBODY  , true);
             curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
             $response = curl_exec($handle);
             $httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
             curl_close($handle);

But as soon as the "nobody"-option is set to true, the returned status codes are incorrect (google.com returns 302, other sites return 303).

Setting this option to false is not possible because of the performance loss.

Any ideas?


回答1:


The default HTTP request method for curl is GET. If you want only the response headers, you can use the HTTP method HEAD.

curl_setopt($handle, CURLOPT_CUSTOMREQUEST, 'HEAD');

According to @Dai's answer, the NOBODY is already using the HEAD method. So the above method will not work.

Another option would be to use fsockopen to open a connection, write the headers using fwrite. Read the response using fgets until the first occurrence of \r\n\r\n to get the complete header. Since you need only the status code, you just need to read the first 13 characters.

<?php
$fp = fsockopen("www.google.com", 80, $errno, $errstr, 30);
if ($fp) {
    $out = "GET / HTTP/1.1\r\n";
    $out .= "Host: www.google.com\r\n";
    $out .= "Accept-Encoding: gzip, deflate, sdch\r\n";
    $out .= "Accept-Language: en-GB,en-US;q=0.8,en;q=0.6\r\n";
    $out .= "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36\r\n";
    $out .= "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fwrite($fp, $out);
    $tmp = explode(' ', fgets($fp, 13));
    echo $tmp[1];
    fclose($fp);
}



回答2:


cURL's nobody option has it use the HEAD HTTP verb, I'd wager the majority of non-static web applications I the wild don't handle this verb correctly, hence the problems you're seeing with different results. I suggest making a normal GET request and discarding the response.




回答3:


i suggest get_headers() instead:

<?php
$url = 'http://www.example.com';

print_r(get_headers($url));

print_r(get_headers($url, 1));
?>


来源:https://stackoverflow.com/questions/27236135/get-http-statuscode-without-body-using-curl

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!