Content from a large number of web pages into an array (PHP)

独自空忆成欢 · Submitted on 2019-12-11 09:16:06

Question


I have an array ($x) containing 748 URLs. Now, I want to fetch a specific part from each page and put all those parts into a new array. That is, an array containing 748 pieces of text, each from a different URL defined in $x.

Here's the code I've got so far:

foreach ($x as $row) {
    $contents = file_get_contents($row);

    $regex = '/delimiter_start(.*?)delimiter_end/s';
    preg_match_all($regex, $contents, $output);
}

If I var_dump $output, I get a strange array that keeps printing content endlessly until I press stop in my browser. The output looks like this:

array(2) {
  [0]=>
  array(1) {
    [0]=>
    string(4786) "string 1. The one I want from the first page."
  }
  [1]=>
  array(1) {
    [0]=>
    string(4755) "string 1 again"
  }
}

array(2) {
  [0]=>
  array(1) {
    [0]=>
    string(8223) "string 2. The one I want from the second page."
  }
  [1]=>
  array(1) {
    [0]=>
    string(8192) "string 2 again"
  }
}

EDIT: I can actually retrieve the results I'm looking for with $output[0]. But how do I create a new array with the same contents as $output[0] that is accessible outside the loop?


Answer 1:


The output you are seeing from preg_match_all is normal: the output array contains the full matched text at index 0 and the text captured by each group at index 1, so every page produces a two-element array.
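As a minimal illustration (a sketch added here, not part of the original answer), running the same pattern against a short test string that uses the question's placeholder delimiters shows the two sub-arrays:

    // Hypothetical test string; delimiter_start / delimiter_end stand in for the real markers.
    $contents = 'delimiter_start first piece delimiter_end some text delimiter_start second piece delimiter_end';

    preg_match_all('/delimiter_start(.*?)delimiter_end/s', $contents, $output);

    print_r($output[0]); // full matches, including the delimiters
    print_r($output[1]); // capture group 1 only: " first piece ", " second piece "

Collecting $output[0] from each page into one array, as the question's edit asks, then looks like this: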

$lines = array();
foreach ($x as $row) {
    $contents = file_get_contents($row);

    $regex = '/delimiter_start(.*?)delimiter_end/s';
    preg_match_all($regex, $contents, $output);

    // $output[0] holds the full matches for this page
    if (is_array($output) && isset($output[0]) && !empty($output[0])) {
        $lines[] = $output[0];
    }
}
var_dump($lines);
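If the goal is a flat array of 748 strings, one per URL, rather than an array of per-page match arrays, a small variation of the above (a sketch, assuming only the first match on each page matters and that the text between the delimiters, not the full match, is the useful part) would be:

    $lines = array();
    foreach ($x as $row) {
        $contents = @file_get_contents($row);
        if ($contents === false) {
            continue; // skip URLs that could not be fetched
        }
        // preg_match stops at the first match; $m[1] is the captured text between the delimiters
        if (preg_match('/delimiter_start(.*?)delimiter_end/s', $contents, $m)) {
            $lines[] = $m[1];
        }
    }
    var_dump($lines); // flat array: one string per successfully matched page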


Source: https://stackoverflow.com/questions/13953064/content-from-large-number-of-web-pages-into-array-php
