Sending multiple goutte requests asynchronously

Posted by 半腔热情 on 2019-12-08 08:52:21

Question


This is the code I am using:

require_once 'goutte.phar';
use Goutte\Client;
$client = new Client();
for($i=0;$i<10;$i++){
     $crawler = $client->request('GET', 'http://website.com');
     echo '<p>'.$crawler->filterXpath('//meta[@property="og:description"]')->attr('content').'</p>';
     echo '<p>'.$crawler->filter('title')->text().'</p>';
}

This works, but it takes a long time to process. Is there any way to make it faster?


Answer 1:


For starters, there is nothing asynchronous about your code sample. Your application sequentially performs a GET request, waits for the response, parses the response, and then loops back to the next iteration.

While Goutte uses Guzzle internally, it does not make use of Guzzle's asynchronous capabilities.

To make your code truly asynchronous, refer to the Guzzle documentation on:

  • Sending Requests within a Pool
  • Asynchronous Response Handling

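Rewritten with a Guzzle pool, your code sample would look something like this (it uses the Guzzle 5 API; $url1 through $url10 are placeholders for the URLs to fetch):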

require 'vendor/autoload.php'; // assuming Composer package management

use GuzzleHttp\Event\CompleteEvent; // type hinted in the 'complete' listener below

$client = new GuzzleHttp\Client();

$requests = [
    $client->createRequest('GET', $url1),
    $client->createRequest('GET', $url2),
    $client->createRequest('GET', $url3),
    $client->createRequest('GET', $url4),
    $client->createRequest('GET', $url5),
    $client->createRequest('GET', $url6),
    $client->createRequest('GET', $url7),
    $client->createRequest('GET', $url8),
    $client->createRequest('GET', $url9),
    $client->createRequest('GET', $url10),  
];

$options = [
    'complete' => [
        [
            'fn' => function (CompleteEvent $event) {
                $crawler = new Symfony\Component\DomCrawler\Crawler(null, $event->getRequest()->getUrl());
                // Cast the body stream to a string before handing it to the crawler.
                $crawler->addContent((string) $event->getResponse()->getBody(), $event->getResponse()->getHeader('Content-Type'));
                echo '<p>'.$crawler->filterXpath('//meta[@property="og:description"]')->attr('content').'</p>';
                echo '<p>'.$crawler->filter('title')->text().'</p>';
            },
            'priority' => 0,    // Optional
            'once'     => false // Optional
        ]
    ]
];

$pool = new GuzzleHttp\Pool($client, $requests, $options);

$pool->wait();
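
The example above relies on createRequest() and the event system, both of which belong to Guzzle 5 and were removed in Guzzle 6. As a rough sketch only, the same idea might look like the following on Guzzle 6 or later, assuming guzzlehttp/guzzle and symfony/dom-crawler are installed through Composer. The helper names requestsFor() and crawlAll(), the concurrency of 5, and the choice to store null for failed requests are assumptions made for illustration, not part of the original answer.

```php
<?php
require 'vendor/autoload.php'; // assumes Composer with Guzzle 6+ and symfony/dom-crawler

use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;
use Psr\Http\Message\ResponseInterface;
use Symfony\Component\DomCrawler\Crawler;

// Hypothetical helper: lazily yields one PSR-7 GET request per URL.
function requestsFor(array $urls): Generator
{
    foreach ($urls as $url) {
        yield new Request('GET', $url);
    }
}

// Hypothetical helper: fetches all URLs concurrently and returns the
// page titles keyed by each URL's position in $urls (null on failure
// or when the page has no <title>).
function crawlAll(Client $client, array $urls, int $concurrency = 5): array
{
    $results = [];

    $pool = new Pool($client, requestsFor($urls), [
        'concurrency' => $concurrency, // maximum requests in flight at once
        'fulfilled' => function (ResponseInterface $response, $index) use (&$results, $urls) {
            $crawler = new Crawler((string) $response->getBody(), $urls[$index]);
            $title = $crawler->filterXPath('//title');
            // attr() and text() throw on an empty node list, so check count() first.
            $results[$index] = $title->count() ? $title->text() : null;
        },
        'rejected' => function ($reason, $index) use (&$results) {
            $results[$index] = null; // request failed; $reason holds the exception
        },
    ]);

    // Start the transfers and block until every request has settled.
    $pool->promise()->wait();

    return $results;
}
```

Calling crawlAll(new Client(), $urls) with an array of URLs returns the titles in whatever order the responses arrive, so the array keys (the positions in $urls) are what tie each title back to its URL. The og:description lookup from the question can be added inside the 'fulfilled' callback in the same way, again checking count() before calling attr('content').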


Source: https://stackoverflow.com/questions/29454946/sending-multiple-goutte-requests-asynchronously
