Question
I'm trying to read an HTML page with Simple HTML DOM, but the page requires a login.
For example, http://example.com/login/ is the login page and http://example.com/page/ is the page whose HTML I need to parse.
So I used cURL to log in and Simple HTML DOM to parse.
But I don't know whether the login succeeded, because when I display the response from cURL, it contains the login page!
I searched almost all the related questions on Stack Overflow for many hours, but I couldn't find what is going wrong.
Below is my code:
<?php
$curlPost['username']="username";
$curlPost['password']="pass";
$curlPost['token']="xxxxxxxxxx";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL , "http://example.com/login/");
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $curlPost);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookies.txt");
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookies.txt");
$response= curl_exec ($ch);
curl_close($ch);
And the code to retrieve the HTML page:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL , "http://example.com/page/");
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13");
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookies.txt");
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookies.txt");
$reponse= curl_exec ($ch);
curl_close($ch);
echo $response;
?>
Below is what I get at the top of the response:
HTTP/1.1 302 Found
Date: Wed, 28 Jan 2015 06:59:44 GMT
Server: Apache
X-Powered-By: PHP/5.3.3
Cache-Control: no-cache
Location: /login
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
HTTP/1.1 200 OK
Date: Wed, 28 Jan 2015 06:59:45 GMT
Server: Apache
X-Powered-By: PHP/5.3.3
Cache-Control: no-cache
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
followed by the HTML of the login page.
Can anyone advise me on what I'm doing wrong?
I'm running this on localhost, with the destination hosted on a remote server.
Also, I didn't see any changes to the "cookies.txt" file.
Many thanks.
Answer 1:
That looks like normal output to me. If you don't want the headers, don't set CURLOPT_HEADER.
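Beyond the headers, a few things in the posted code may be worth checking. The second request stores its result in `$reponse`, but the script echoes `$response`, so the output shown is the response to the login POST, not to /page/. A relative "cookies.txt" is resolved against the current working directory, which may not be the script's directory or may not be writable, and that could explain why the file never changes. The sketch below is one possible way to structure both requests, not a confirmed fix. The helper name `fetch_with_cookies`, its return keys, and the use of `http_build_query` (which sends the fields as application/x-www-form-urlencoded instead of the multipart/form-data that an array produces) are assumptions for illustration; the site may need something different.

```php
<?php
// Hypothetical helper (not part of the original code): performs one request
// and shares a cookie file between calls so the session survives.
function fetch_with_cookies(string $url, string $cookieFile, ?array $postFields = null): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);   // follow 302 redirects
    curl_setopt($ch, CURLOPT_HEADER, false);          // keep response headers out of the body
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); // read cookies from this file
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);  // write cookies here when the handle closes

    if ($postFields !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        // Assumption: the login form expects URL-encoded fields.
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields));
    }

    $body = curl_exec($ch); // false on failure
    $result = [
        'body'      => $body,
        'status'    => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'final_url' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
        'error'     => curl_error($ch),
    ];
    curl_close($ch);

    return $result;
}
```

To use it, call it once for http://example.com/login/ with the username, password, and token fields, then once for http://example.com/page/ with no fields, passing the same absolute path both times, for example `__DIR__ . '/cookies.txt'`. If `final_url` of the second call still contains `/login`, the login most likely did not succeed, and `error` and `status` may give a hint why.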
Source: https://stackoverflow.com/questions/28186768/authorize-with-curl-and-parse-using-simple-html-dom-not-working