Why is lookahead (sometimes) faster than capturing?

粉色の甜心 2020-12-19 07:38

This question is inspired by this other one.

Comparing s/,(\d)/$1/ to s/,(?=\d)//: the former uses a capture group to replace only the di

2 answers
  •  一个人的身影
    2020-12-19 08:26

    As always, when you want to know which of two pieces of code works faster, you have to test it:

    #!/usr/bin/perl
    
    use 5.012;
    use warnings;
    use Benchmark qw(cmpthese);
    
    say "Extreme ,,,:";
    my $Text = ',' x (my $LEN = 512);
    cmpthese my $TIME = -10, my $CMP = {
        capture => \&capture,
        lookahead => \&lookahead,
    };
    
    say "\nExtreme ,0,0,0:";
    $Text = ',0' x $LEN;
    cmpthese $TIME, $CMP;
    
    my $P = 0.01;
    say "\nMixed (@{[$P * 100]}% zeros):";
    my $zeros = $LEN * $P;
    $Text = ',' x ($LEN - $zeros) . ',0' x $zeros;
    cmpthese $TIME, $CMP;
    
    sub capture {
        local $_ = $Text;
        s/,(\d)/$1/;
    }
    
    sub lookahead {
        local $_ = $Text;
        s/,(?=\d)//;
    }
    

    The benchmark tests three different cases:

    1. Only ','
    2. Only ',0'
    3. 1% ',0', rest ','

    On my machine and with my perl version, it produces these results:

    Extreme ,,,:
                 Rate   capture lookahead
    capture   23157/s        --       -1%
    lookahead 23362/s        1%        --
    
    Extreme ,0,0,0:
                   Rate   capture lookahead
    capture    419476/s        --      -65%
    lookahead 1200465/s      186%        --
    
    Mixed (1% zeros):
                 Rate   capture lookahead
    capture   22013/s        --       -4%
    lookahead 22919/s        4%        --
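
The percentage columns follow from the rates: each cell is the row's rate relative to the column's rate, minus 100%. A short sketch that recomputes the ',0,0,0' table from the rates printed above (the helper name `pct` is made up for this example):

```perl
use strict;
use warnings;

# Relative difference of one rate against another, as a rounded percentage.
sub pct {
    my ($row_rate, $col_rate) = @_;
    return sprintf '%.0f%%', 100 * $row_rate / $col_rate - 100;
}

# Rates (iterations per second) copied from the ",0,0,0" table.
my %rate = (capture => 419476, lookahead => 1200465);

print 'capture vs lookahead: ', pct($rate{capture}, $rate{lookahead}), "\n";    # -65%
print 'lookahead vs capture: ', pct($rate{lookahead}, $rate{capture}), "\n";    # 186%
```

In other words, on that input the lookahead version completes close to 2.9 times as many iterations per second as the capturing one.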
    

    These results substantiate the assumption that the look-ahead version is significantly faster than the capturing version, except when the input consists almost entirely of commas, where the two are within a few percent of each other. This is not very surprising, as PSIAlt already explained in his comment.

    regards, Matthias
