Question
I'm trying to grab some images from another site. The problem is that each image is on a different page,
e.g. id/1, id/2, id/3, and so on.
So far I have the code below, which can grab an image from a single given URL using:
$returned_content = get_data('http://somedomain.com/id/1/');
But I need to make the line above work on an array (I guess), so that it grabs the image from page 1, then the next image from page 2, then page 3, and so on automatically.
function get_data($url) {
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
$returned_content = get_data('http://somedomain.com/id/1/');
if (preg_match_all("~http://somedomain.com/images/(.*?)\.jpg~i", $returned_content, $matches)) {
    $src = 0;
    foreach ($matches[1] as $key) {
        if (++$src > 1) break;
        $out = $key;
    }
    $file = 'http://somedomain.com/images/' . $out . '.jpg';
    $dir = 'photos';
    $imgurl = get_data($file);
    file_put_contents($dir . '/' . $out . '.jpg', $imgurl);
    echo 'done';
}
As always all help is appreciated and thanks in advance.
Answer 1:
This was pretty confusing, because it sounded like you were only interested in saving one image per page. But then the code makes it look like you're actually trying to save every image on each page. So it's entirely possible I completely misunderstood... But here goes.
Looping over each page isn't that difficult:
// Fetch pages 1 through 100; adjust $l to the real page count.
$i = 1;
$l = 101; // exclusive upper bound
while ($i < $l) {
    $html = get_data('http://somedomain.com/id/'.$i.'/');
    getImages($html);
    $i += 1;
}
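Since the question asks about turning the single URL into an array, here is roughly how that might look. This is only a minimal sketch: the helper name `buildPageUrls` and the 1 to 100 range are assumptions for illustration, not something given in the question.

```php
<?php
// Hypothetical helper: builds the page URLs for ids $first..$last (inclusive).
function buildPageUrls($base, $first, $last) {
    $urls = array();
    foreach (range($first, $last) as $id) {
        $urls[] = $base . $id . '/';
    }
    return $urls;
}

// Assumed range: pages 1 to 100, the same pages the while loop visits.
$pageUrls = buildPageUrls('http://somedomain.com/id/', 1, 100);
```

Each entry could then be passed to get_data() in a foreach loop, in place of the counter-based while loop.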
The following then assumes that you're trying to save all the images on that particular page:
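function getImages($html) {
    // Collect the name (without extension) of every matching .jpg URL on the page.
    $matches = array();
    $regex = '~http://somedomain\.com/images/(.*?)\.jpg~i';
    preg_match_all($regex, $html, $matches);
    foreach ($matches[1] as $img) {
        saveImg($img);
    }
}
function saveImg($name) {
    $url = 'http://somedomain.com/images/'.$name.'.jpg';
    $data = get_data($url);
    // curl_exec() returns false on failure, so skip the write in that case.
    // The photos/ directory must already exist.
    if ($data !== false) {
        file_put_contents('photos/'.$name.'.jpg', $data);
    }
}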
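If the goal really is just one image per page, as the question's wording suggests, preg_match (which stops at the first match) could stand in for preg_match_all. The following is a minimal sketch under that assumption; `firstImageName` is a hypothetical helper, and the sample HTML is made up for illustration.

```php
<?php
// Hypothetical helper: returns the name (without ".jpg") of the first
// matching image URL in $html, or null when the page has no match.
function firstImageName($html) {
    $regex = '~http://somedomain\.com/images/(.*?)\.jpg~i';
    if (preg_match($regex, $html, $match) === 1) {
        return $match[1];
    }
    return null;
}

// Made-up sample page with two images; only the first one is picked.
$sample = '<img src="http://somedomain.com/images/cat.jpg">'
        . '<img src="http://somedomain.com/images/dog.jpg">';
$name = firstImageName($sample); // "cat"
```

The returned name could then be handed to saveImg() whenever it is not null.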
Source: https://stackoverflow.com/questions/7747875/grab-download-images-from-multiple-pages-using-php-preg-match-all-curl