Is Perl's unpack() ever faster than substr()?

耶瑟儿~ 2020-12-06 00:48

Several times I've read that unpack() is faster than substr(), especially as the number of substrings increases. However, this benchmark suggests

5 answers
  •  醉酒成梦
    2020-12-06 01:39

    As a matter of fact, your benchmark is flawed in a really interesting way. What you are actually comparing is how efficiently unpack and map can throw away a list, because Benchmark::cmpthese() executes the functions in void context.

    The reason your substr version comes out on top is this line of code in pp_mapwhile() in pp_ctl.c:

    if (items && gimme != G_VOID) {
    

    That is, perl's map skips a bunch of work (namely, allocating storage for the results of the map) if it knows it is being called in void context!
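To see how void context reaches the map in the first place, here is a minimal sketch using the built-in wantarray(). The names record_context and outer are hypothetical helpers made up for this illustration; they are not part of the original benchmark.

```perl
use strict;
use warnings;

my @seen;

# Record the context this sub was called in.
# wantarray() is undef in void context, false in scalar, true in list.
sub record_context {
    my $w = wantarray;
    push @seen, !defined $w ? 'void' : $w ? 'list' : 'scalar';
    return;
}

# The expression after "return" is evaluated in the caller's context,
# so a void-context call to outer() puts record_context() in void too.
sub outer { return record_context() }

record_context();                 # void
my @list   = record_context();    # list
my $scalar = record_context();    # scalar
outer();                          # void, inherited through "return"

print "@seen\n";                  # void list scalar void
```

The last call is the relevant one: a benchmarked sub of the form `sub { return map { ... } ... }` that is itself called in void context passes that void context down to map.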

    (My hunch about the Windows vs. other difference seen above is that Windows-based perl memory allocation is awful, so skipping the allocation is a bigger saving there. That is just a hunch, though; I don't have a Windows box to play with. The actual unpack implementation is straight-up C code and shouldn't differ substantially between Windows and other platforms.)

    I have three different solutions for working around this issue and generating a fairer comparison:

    1. assign the list to an array
    2. loop over the list inside the function, and return nothing
    3. return a reference to the list (hiding the void context)

    Here's my version of %methods, with all three workarounds, plus a plain list-returning version (*_return_array) that still runs in void context:

    my %methods = (
        unpack_assign => sub { my @foo = unpack $format_string, $data; return },
        unpack_loop => sub { for my $foo (unpack $format_string, $data) { } },
        unpack_return_ref => sub { return [ unpack $format_string, $data ] },
        unpack_return_array => sub { return unpack $format_string, $data },
    
        substr_assign => sub { my @foo = map {substr $data, $_, 1} 0 .. ($n_substrings - 1) },
        substr_loop => sub { for my $foo ( map {substr $data, $_, 1} 0 .. ($n_substrings - 1)) { } },
        substr_return_ref => sub { return [ map {substr $data, $_, 1} 0 .. ($n_substrings - 1) ] },
        substr_return_array => sub { return map { substr $data, $_, 1} 0 .. ($n_substrings - 1) },
    );
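The %methods hash relies on $data, $format_string and $n_substrings, which are defined in the question's script and not shown here. The following is a minimal harness sketch for trying two of the variants; the setup values (a 100-character string and a format of 'a1' repeated once per character) are assumptions for illustration, not the original poster's values.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed setup; the question's real values are not shown in this answer.
my $n_substrings  = 100;
my $data          = join '', map { chr(ord('a') + $_ % 26) } 0 .. $n_substrings - 1;
my $format_string = 'a1' x $n_substrings;   # one 1-byte field per substring

my %methods = (
    # Assigning to an array forces list context, so the results must be built.
    unpack_assign => sub { my @foo = unpack $format_string, $data; return },
    substr_assign => sub { my @foo = map { substr $data, $_, 1 } 0 .. $n_substrings - 1; return },
);

# Sanity check: both approaches must produce the same list of substrings.
my @u = unpack $format_string, $data;
my @s = map { substr $data, $_, 1 } 0 .. $n_substrings - 1;
die "unpack and substr disagree" unless "@u" eq "@s";

# Fixed iteration count to keep the run short; a negative count such as -3
# would instead run each sub for at least that many CPU seconds.
cmpthese(50_000, \%methods);
```

Checking that both variants return identical lists before timing them guards against benchmarking two pieces of code that do different work.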
    

    And my results:

    $ perl -v
    
    This is perl, v5.10.0 built for x86_64-linux-gnu-thread-multi
    
    $ perl foo.pl
    10
                            Rate substr_assign substr_return_ref substr_loop unpack_assign unpack_return_ref unpack_loop unpack_return_array substr_return_array
    substr_assign       101915/s            --              -20%        -21%          -28%              -51%        -51%                -65%                -69%
    substr_return_ref   127224/s           25%                --         -1%          -10%              -39%        -39%                -57%                -62%
    substr_loop         128484/s           26%                1%          --           -9%              -38%        -39%                -56%                -61%
    unpack_assign       141499/s           39%               11%         10%            --              -32%        -32%                -52%                -57%
    unpack_return_ref   207144/s          103%               63%         61%           46%                --         -1%                -29%                -37%
    unpack_loop         209520/s          106%               65%         63%           48%                1%          --                -28%                -37%
    unpack_return_array 292713/s          187%              130%        128%          107%               41%         40%                  --                -12%
    substr_return_array 330827/s          225%              160%        157%          134%               60%         58%                 13%                  --
    100
                           Rate substr_assign substr_loop substr_return_ref unpack_assign unpack_return_ref unpack_loop unpack_return_array substr_return_array
    substr_assign       11818/s            --        -25%              -25%          -26%              -53%        -55%                -63%                -70%
    substr_loop         15677/s           33%          --               -0%           -2%              -38%        -40%                -51%                -60%
    substr_return_ref   15752/s           33%          0%                --           -2%              -37%        -40%                -51%                -60%
    unpack_assign       16061/s           36%          2%                2%            --              -36%        -39%                -50%                -59%
    unpack_return_ref   25121/s          113%         60%               59%           56%                --         -4%                -22%                -35%
    unpack_loop         26188/s          122%         67%               66%           63%                4%          --                -19%                -33%
    unpack_return_array 32310/s          173%        106%              105%          101%               29%         23%                  --                -17%
    substr_return_array 38910/s          229%        148%              147%          142%               55%         49%                 20%                  --
    1000
                          Rate substr_assign substr_return_ref substr_loop unpack_assign unpack_return_ref unpack_loop unpack_return_array substr_return_array
    substr_assign       1309/s            --              -23%        -25%          -28%              -52%        -54%                -62%                -67%
    substr_return_ref   1709/s           31%                --         -3%           -6%              -38%        -41%                -51%                -57%
    substr_loop         1756/s           34%                3%          --           -3%              -36%        -39%                -49%                -56%
    unpack_assign       1815/s           39%                6%          3%            --              -34%        -37%                -48%                -55%
    unpack_return_ref   2738/s          109%               60%         56%           51%                --         -5%                -21%                -32%
    unpack_loop         2873/s          120%               68%         64%           58%                5%          --                -17%                -28%
    unpack_return_array 3470/s          165%              103%         98%           91%               27%         21%                  --                -14%
    substr_return_array 4015/s          207%              135%        129%          121%               47%         40%                 16%                  --
    10000
                         Rate substr_assign substr_return_ref substr_loop unpack_assign unpack_return_ref unpack_loop unpack_return_array substr_return_array
    substr_assign       131/s            --              -23%        -27%          -28%              -52%        -55%                -63%                -67%
    substr_return_ref   171/s           30%                --         -5%           -6%              -38%        -42%                -52%                -57%
    substr_loop         179/s           37%                5%          --           -1%              -35%        -39%                -50%                -55%
    unpack_assign       181/s           38%                6%          1%            --              -34%        -38%                -49%                -55%
    unpack_return_ref   274/s          109%               60%         53%           51%                --         -6%                -23%                -32%
    unpack_loop         293/s          123%               71%         63%           62%                7%          --                -18%                -27%
    unpack_return_array 356/s          171%              108%         98%           96%               30%         21%                  --                -11%
    substr_return_array 400/s          205%              134%        123%          121%               46%         37%                 13%                  --
    100000
                          Rate substr_assign substr_return_ref substr_loop unpack_assign unpack_return_ref unpack_loop unpack_return_array substr_return_array
    substr_assign       13.0/s            --              -22%        -26%          -29%              -51%        -55%                -63%                -67%
    substr_return_ref   16.7/s           29%                --         -5%           -8%              -37%        -43%                -52%                -58%
    substr_loop         17.6/s           36%                5%          --           -3%              -33%        -40%                -50%                -56%
    unpack_assign       18.2/s           40%                9%          3%            --              -31%        -37%                -48%                -54%
    unpack_return_ref   26.4/s          103%               58%         50%           45%                --         -9%                -25%                -34%
    unpack_loop         29.1/s          124%               74%         65%           60%               10%          --                -17%                -27%
    unpack_return_array 35.1/s          170%              110%         99%           93%               33%         20%                  --                -12%
    substr_return_array 39.7/s          206%              137%        125%          118%               50%         36%                 13%                  --
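For reading the tables: each cell in cmpthese() output is the row's rate relative to the column's rate. The sketch below recomputes two cells from the n = 10 table above; relative_pct is a hypothetical helper written for this illustration, not a Benchmark function.

```perl
use strict;
use warnings;

# Rates copied from the n = 10 table above (iterations per second).
my %rate = (
    substr_assign => 101915,
    unpack_assign => 141499,
);

# Percentage by which the row's rate differs from the column's rate.
sub relative_pct {
    my ($row, $col) = @_;
    return sprintf '%.0f%%', 100 * ($rate{$row} / $rate{$col} - 1);
}

print relative_pct('unpack_assign', 'substr_assign'), "\n";   # 39%
print relative_pct('substr_assign', 'unpack_assign'), "\n";   # -28%
```

So in the fair comparison (both sides assigning to an array), unpack runs about 39% faster than the substr/map version at n = 10.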
    

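    So, back to the original question: "Is unpack() ever faster than substr()?" Answer: always, for this type of application -- unless you don't care about the return values ;) (The *_return_array rows, where the list is thrown away in void context, are the one case where substr comes out ahead.)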
