Scrape a particular area of site content With a Secure Login

Submitted by 这一生的挚爱 on 2019-12-11 16:24:25

Question


I am trying to scrape a particular piece of text from a website that sits behind a login. Here is a tutorial on doing this with cURL: http://www.digeratimarketing.co.uk/2008/12/16/curl-page-scraping-script/

But I am unable to work this into my own cURL code. Here is my script:

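$url     = "http://aftabcurrency.com/login_script.php";
$cookie  = 'cookies.txt';   // cookie jar that keeps the login session
$timeout = 30;

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,            $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);        // return the page instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);        // follow redirects after the login
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);       // seconds allowed for connecting
curl_setopt($ch, CURLOPT_TIMEOUT,        $timeout); // seconds allowed for the whole request
curl_setopt($ch, CURLOPT_COOKIEJAR,      $cookie);  // write cookies here
curl_setopt($ch, CURLOPT_COOKIEFILE,     $cookie);  // read cookies from here

// Send the login form fields as a POST request.
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, "user_name=user&user_password=pass&passcode=code");

$result = curl_exec($ch);
if ($result === false) {
    die("cURL error: " . curl_error($ch));
}
curl_close($ch);

$source = $result;
if (preg_match("/(CC3300\">)(.*?)(<\/font>)/is", $source, $found)) {
    echo $found[2];
} else {
    echo "Text not found.";
}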

For example, on aftabcurrency.com I only want to scrape the text "Our Services Matters!" (this text changes every day).


Answer 1:


What I would do is "cut out" the text between a start marker and an end marker. In the page source, the text starts right after the font color 613A75 and ends with the closing </font> tag. Here is a regex solution:

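$source = file_get_contents("http://aftabcurrency.com/index.php");
// Capture everything between the 613A75 color attribute and the closing </font>.
if ($source !== false && preg_match("/(613A75\">)(.*?)(<\/font>)/is", $source, $found)) {
    echo $found[2];
} else {
    echo "Text not found.";
}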

If you want to do this with the text inside the member area, append this snippet to your own script and replace the $source = file_get_contents(...) line with $source = $result.
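One way to make the snippet reusable for both cases is to wrap the match in a small helper that takes any HTML string, whether it came from file_get_contents or from the cURL $result. This is only a sketch: the function name extractByFontColor and the sample markup are invented for illustration, and it assumes the target text sits in a <font> tag whose color attribute ends with the given hex code.

```php
<?php
// Hypothetical helper (not from the original post): returns the text that
// follows `<color>">` up to the next </font>, or null when nothing matches.
function extractByFontColor(string $html, string $color): ?string
{
    // preg_quote() escapes the marker so it is matched literally.
    $pattern = '/' . preg_quote($color, '/') . '">(.*?)<\/font>/is';
    if (preg_match($pattern, $html, $found)) {
        // Drop any nested tags and surrounding whitespace.
        return trim(strip_tags($found[1]));
    }
    return null;
}

// Made-up sample markup standing in for $result from curl_exec().
$sample = '<p><font color="#613A75">Our Services Matters!</font></p>';
echo extractByFontColor($sample, '613A75') ?? 'Text not found.';
```

Passing the color code as a parameter means the same helper would also cover the CC3300 marker used in the question's script.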

There are also other ways to do this: DOMDocument with XPath, or the simple PHP string functions strpos / strstr / substr.
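As a rough sketch of those two alternatives, the block below extracts the same text once with DOMDocument plus DOMXPath and once with strpos / substr. The function names, the sample markup, and the assumption that the text lives in a <font> element whose color attribute contains 613A75 are all illustrative; the XPath variant also assumes PHP's DOM extension is installed.

```php
<?php
// Made-up sample markup; in practice $html would be the cURL $result or
// the file_get_contents() output.
$html = '<html><body><font color="#613A75">Our Services Matters!</font></body></html>';

// Alternative 1 (hypothetical helper): DOMDocument + DOMXPath.
function extractWithXPath(string $html, string $color): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }
    $dom = new DOMDocument();
    // Real-world HTML is often malformed, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Select <font> elements whose color attribute contains the hex code.
    $nodes = $xpath->query('//font[contains(@color, "' . $color . '")]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

// Alternative 2 (hypothetical helper): plain string functions.
function extractWithStrpos(string $html, string $startMarker, string $endMarker): ?string
{
    $start = strpos($html, $startMarker);
    if ($start === false) {
        return null;
    }
    $start += strlen($startMarker);          // skip past the start marker
    $end = strpos($html, $endMarker, $start); // first end marker after it
    if ($end === false) {
        return null;
    }
    return trim(substr($html, $start, $end - $start));
}

echo extractWithXPath($html, '613A75') ?? 'Text not found.', PHP_EOL;
echo extractWithStrpos($html, '613A75">', '</font>') ?? 'Text not found.', PHP_EOL;
```

The XPath version tends to tolerate changes in attribute order or whitespace better than a regex, while the strpos version needs no extension and is easy to follow when the markers are fixed strings.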



Source: https://stackoverflow.com/questions/11184447/scrape-a-particular-area-of-site-content-with-a-secure-login
