PHP multi cURL performance worse than sequential file_get_contents


1. Simple optimization

  • You should sleep for about 2500 microseconds when curl_multi_select fails.
    In practice, it definitely fails from time to time during each execution.
    Without sleeping, your CPU gets fully occupied by what amounts to a
    busy while (true) { } loop.
  • If you do nothing while some (but not all) of the requests have finished,
    you should make the maximum timeout seconds larger.
  • Your code is written for old versions of libcurl. As of libcurl 7.20.0,
    curl_multi_exec no longer returns the state CURLM_CALL_MULTI_PERFORM.

So, the following code

$running = null;
$mrc = null;
do
{
    $mrc = curl_multi_exec( $master , $running );
}
while ( $mrc == CURLM_CALL_MULTI_PERFORM );
while ( $running && $mrc == CURLM_OK )
{
    if (curl_multi_select( $master ) != - 1)
    {
        do
        {
            $mrc = curl_multi_exec( $master , $running );
        }
        while ( $mrc == CURLM_CALL_MULTI_PERFORM );
    }
}

should be

curl_multi_exec($master, $running);
do
{
    if (curl_multi_select($master, 99) === -1)
    {
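        // select failed this round; back off briefly instead of busy-looping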
        usleep(2500);
        continue;
    }
    curl_multi_exec($master, $running);
} while ($running);

Note

The timeout value of curl_multi_select needs tuning only if you want to react as individual requests complete, like this:

curl_multi_exec($master, $running);
do
{
    if (curl_multi_select($master, $TIMEOUT) === -1)
    {
        usleep(2500);
        continue;
    }
    curl_multi_exec($master, $running);
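    // Drain completion messages for transfers that just finished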
    while ($info = curl_multi_info_read($master))
    {
        /* Do something with $info */
    }
} while ($running);

Otherwise, the value should be extremely large.
(However, PHP_INT_MAX is too large; libcurl treats it as an invalid value.)
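
For example, when there is nothing to do between completions, a large but finite timeout keeps the loop asleep until there is real activity (3600 seconds below is an arbitrary illustrative value, not a libcurl recommendation):

// Block for up to an hour at a time; curl_multi_select returns early
// as soon as any transfer needs attention.
if (curl_multi_select($master, 3600) === -1) {
    usleep(2500);
}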

2. Easy experiment in one PHP process

I tested using my parallel cURL executor library: mpyw/co


<?php 

require 'vendor/autoload.php';

use mpyw\Co\Co;

function four_sequential_requests_by_one_hundred_people()
{
    $tasks = [];
    for ($i = 0; $i < 100; ++$i) {
        $tasks[] = function () use ($i) {
            $ch = curl_init();
            curl_setopt_array($ch, [
                CURLOPT_URL => 'example.com',
                CURLOPT_FORBID_REUSE => true,
                CURLOPT_RETURNTRANSFER => true,
            ]);
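            // Yield the same handle four times: four sequential requests per person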
            for ($j = 0; $j < 4; ++$j) {
                yield $ch;
            }
        };
    }
    $start = microtime(true);
    yield $tasks;
    $end = microtime(true);
    printf("Time of %s: %.2f sec\n", __FUNCTION__, $end - $start);
}

function requests_by_four_hundred_people()
{
    $tasks = [];
    for ($i = 0; $i < 400; ++$i) {
        $tasks[] = function () use ($i) {
            $ch = curl_init();
            curl_setopt_array($ch, [
                CURLOPT_URL => 'example.com',
                CURLOPT_FORBID_REUSE => true,
                CURLOPT_RETURNTRANSFER => true,
            ]);
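            // Yield once: a single request per person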
            yield $ch;
        };
    }
    $start = microtime(true);
    yield $tasks;
    $end = microtime(true);
    printf("Time of %s: %.2f sec\n", __FUNCTION__, $end - $start);
}

Co::wait(four_sequential_requests_by_one_hundred_people(), [
    'concurrency' => 0, // Zero means unlimited
]);

Co::wait(requests_by_four_hundred_people(), [
    'concurrency' => 0, // Zero means unlimited
]);

I ran the experiment five times, and also tried the two functions in reverse
order (the 3rd request was kicked xD). The results consistently showed that
too many concurrent TCP connections actually decrease throughput.

3. Advanced optimization

3-A. For different destinations

If you want to optimize for both few and many concurrent requests, the following dirty solution may help you (a sketch follows the list):

  1. Share the number of current requesters using apcu_add / apcu_fetch / apcu_delete.
  2. Switch between methods (sequential or parallel) based on the current value.
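
Here is a minimal sketch of that idea, assuming an APCu-backed counter. The key name 'requesters', the threshold of 50, and the fetch_parallel / fetch_sequential helpers are hypothetical placeholders, and I use apcu_inc / apcu_dec for atomic updates:

apcu_add('requesters', 0); // create the shared counter once; no-op if it already exists

$current = apcu_inc('requesters'); // atomically register this process and read the count

try {
    if ($current <= 50) {
        // Few concurrent requesters: parallel curl_multi requests win.
        $responses = fetch_parallel($urls);   // hypothetical helper
    } else {
        // Many concurrent requesters: go sequential to avoid TCP congestion.
        $responses = fetch_sequential($urls); // hypothetical helper
    }
} finally {
    apcu_dec('requesters'); // unregister, even if a request throws
}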

3-B. For the same destination

CURLMOPT_PIPELINING will help you. This option bundles all HTTP/1.1 connections to the same destination into one TCP connection.

curl_multi_setopt($master, CURLMOPT_PIPELINING, 1);
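
Note that HTTP/1.1 pipelining support was removed in libcurl 7.62.0, so the value 1 is a no-op on newer builds. The closest modern equivalent is HTTP/2 multiplexing; the snippet below assumes your PHP build defines CURLPIPE_MULTIPLEX and that the server speaks HTTP/2:

// Multiplex transfers to the same host over a single HTTP/2 connection.
curl_multi_setopt($master, CURLMOPT_PIPELINING, CURLPIPE_MULTIPLEX);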