Is it possible to use a Perl hash in a manner that has `O(log(n))` lookup and insertion?

小蘑菇 2021-01-06 23:41

Is it possible to use a Perl hash in a manner that has O(log(n)) lookup and insertion?

By default, I assume the lookup is O(n) since it's

3 answers
  •  难免孤独
    2021-01-06 23:58

    Anybody who thinks that hash insert or lookup time is O(1) on modern hardware is extraordinarily naive. Benchmarking repeated gets of the same value is plain wrong. The following results will give you a much better picture of what's going on.

    Perl version 5.010001
                Rate 10^6 keys 10^5 keys 10^1 keys 10^4 keys 10^3 keys 10^2 keys
    10^6 keys 1.10/s        --      -36%      -64%      -67%      -68%      -69%
    10^5 keys 1.73/s       57%        --      -43%      -49%      -50%      -52%
    10^1 keys 3.06/s      177%       76%        --      -10%      -12%      -15%
    10^4 keys 3.40/s      207%       96%       11%        --       -3%       -5%
    10^3 keys 3.49/s      216%      101%       14%        3%        --       -3%
    10^2 keys 3.58/s      224%      107%       17%        6%        3%        --
    

    The results above were measured on a system with a 5 MB CPU cache. Each benchmark iteration performs 10^6 lookups in total, so the rates correspond to a drop from roughly 3.5M lookups/s to about 1M lookups/s as the hash grows. Even so, it is still very fast, and in some cases you can beat even systems like an RDBMS if you know what you are doing. You can measure your own system using the following code:

    #!/usr/bin/perl
    
    use strict;
    use warnings;
    
    use Benchmark;
    
    print "Perl version $]\n";
    
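    my %subs;
    for my $n ( 1 .. 6 ) {
        my $m = 10**$n;        # number of keys in this hash
        keys( my %h ) = $m;    # preallocate buckets so the hash doesn't have to keep growing
        my $k = "a";
        %h = ( map { $k++ => 1 } 1 .. $m );    # string increment yields distinct keys: a, b, ..., aa, ...
        my $l = 10**( 6 - $n );    # repeat count, so every sub performs 10^6 lookups in total
        my $value;                 # lookup sink; named to avoid shadowing sort's special $a
        $subs{"10^$n keys"} = sub {
            for ( 1 .. $l ) {
                $value = $h{$_} for keys %h;
            }
        };
    }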
    
    Benchmark::cmpthese -3, \%subs;
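
For contrast, the flawed approach criticized earlier (fetching the same key over and over) can be compared with fetching every key. This is only a rough sketch: the helper `build_hash`, the size `$n`, and the labels `same_key` and `all_keys` are illustrative assumptions, not part of the original benchmark, and the numbers will differ from machine to machine.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Benchmark qw(cmpthese);

# Hypothetical helper: build a hash with $n distinct string keys
# ("a", "b", ..., "z", "aa", ...) using Perl's magic string increment.
sub build_hash {
    my ($n) = @_;
    my %h;
    my $k = "a";
    $h{ $k++ } = 1 for 1 .. $n;
    return \%h;
}

my $n    = 100_000;          # assumed size; raise it to outgrow your CPU cache
my $h    = build_hash($n);
my @keys = keys %$h;         # every key, in the hash's internal order
my $one  = $keys[0];         # a single key that will stay hot in cache
my $sink;

# Both subs perform $n lookups; only the access pattern differs.
cmpthese( -1, {
    same_key => sub { $sink = $h->{$one} for 1 .. $n },
    all_keys => sub { $sink = $h->{$_}  for @keys },
} );
```

If `same_key` comes out noticeably faster on your system, that gap is what a single-key benchmark hides.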
    

    Also, don't forget that hash lookup time depends on key length. Put simply, no real technology has O(1) access time; every known real technology has O(log N) access time at best. The only systems that offer O(1) access time do so by capping their maximum N and accepting degraded performance at low N. That is how things work in the real world, and it is the reason people design algorithms like the Judy array, and the trend is only getting worse.
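
The key-length effect mentioned above can be checked with a sketch like the following. The helper `make_keys`, the key count, and the lengths 8, 64 and 512 are arbitrary assumptions chosen for illustration; no particular result is implied, so run it on your own system.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Benchmark qw(cmpthese);

# Hypothetical helper: return $count distinct keys, each zero-padded
# to exactly $len characters (e.g. "00000001" for $len == 8).
sub make_keys {
    my ( $count, $len ) = @_;
    return map { sprintf( "%0${len}d", $_ ) } 1 .. $count;
}

my $count = 10_000;    # assumed number of keys per hash
my %subs;
for my $len ( 8, 64, 512 ) {
    my @keys = make_keys( $count, $len );
    my %h;
    @h{@keys} = (1) x @keys;    # hash slice: insert every key with value 1
    my $sink;
    # Same number of lookups in every sub; only the key length changes.
    $subs{"len $len"} = sub { $sink = $h{$_} for @keys };
}

cmpthese( -1, \%subs );
```

Keeping the number of keys fixed isolates the cost of hashing and comparing longer strings from the cache effects measured earlier.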
