PHP - Is there a safe way to perform deep recursion?

Submitted by 允我心安 on 2019-12-08 05:13:49

Question
I'm talking about performing deep recursion for around five minutes or more, the kind of thing a crawler might do in order to extract URLs and sub-URLs from pages.

It seems that deep recursion in PHP is not realistic.

e.g.

getInfo("www.example.com");

// Note: ->find() below assumes the Simple HTML DOM parser, so the page
// must be loaded with file_get_html() rather than file_get_contents(),
// which only returns a plain string.
function getInfo($link) {
    $content = file_get_html($link);

    if ($con = $content->find('.subCategories', 0)) {
        echo "go deeper<br>";
        getInfo($con->find('a', 0)->href);
    } else {
        echo "reached deepest<br>";
    }
}

Answer 1:
Doing something like this with recursion is a bad idea in any language: you cannot know in advance how deep the crawler will go, so it may overflow the call stack. Even if it doesn't, it wastes a lot of memory on the huge stack, since PHP performs no tail-call optimization (which would discard stack frames that are no longer needed).
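To see why the stack grows, here is a minimal sketch (the `depth` function is invented for illustration): every nested call keeps a live frame until the whole chain unwinds, and a crawler cannot bound that depth in advance.

```php
<?php
// Each recursive call adds a stack frame; none is released until the
// deepest call returns. With Xdebug enabled, its nesting limit would
// abort a chain like this long before 10000 frames.
function depth(int $n, int $d = 0): int {
    return $n === 0 ? $d : depth($n - 1, $d + 1);
}

// 10000 frames are live at once just to count to 10000.
var_dump(depth(10000)); // int(10000)
```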

Push the found URLs into a "to crawl" queue which is checked iteratively:

$queue = array('www.example.com');
$done = array();
while($queue) {
    $link = array_shift($queue);
    $done[] = $link;
    $content = file_get_html($link); // Simple HTML DOM, as in the question
    if($con = $content->find('.subCategories', 0)) {
        $sublink = $con->find('a', 0)->href;
        if(!in_array($sublink, $done) && !in_array($sublink, $queue)) {
            $queue[] = $sublink;
        }
    }
}
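The same queue pattern can be tried without touching the network by substituting an in-memory link map for the fetched pages (the `$links` array below is invented purely for illustration; the lookup stands in for the `->find()` call):

```php
<?php
// Hypothetical link graph standing in for live pages.
$links = [
    'www.example.com'   => 'www.example.com/a',
    'www.example.com/a' => 'www.example.com/b',
    'www.example.com/b' => null, // deepest page: no sub-category link
];

$queue = ['www.example.com'];
$done  = [];
while ($queue) {
    $link   = array_shift($queue);
    $done[] = $link;
    $sublink = $links[$link] ?? null; // stands in for $content->find(...)
    if ($sublink !== null
        && !in_array($sublink, $done)
        && !in_array($sublink, $queue)) {
        $queue[] = $sublink;
    }
}
// $done now lists every visited page, in visiting order, with no
// stack growth no matter how long the chain is.
```

The memory cost is bounded by the size of the queue and the `$done` list, not by crawl depth, which is exactly what the recursive version cannot guarantee.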


Source: https://stackoverflow.com/questions/11278010/php-is-the-there-a-safe-way-to-perform-deep-recursion
