Is it possible to use a Perl hash in a manner that has `O(log(n))` lookup and insertion?

小蘑菇 2021-01-06 23:41

Is it possible to use a Perl hash in a manner that has O(log(n)) lookup and insertion?

By default, I assume the lookup is O(n) since it's

3 Answers

难免孤独 (OP)

2021-01-06 23:58
Anybody who thinks that hash insert or lookup time is O(1) on modern hardware is extraordinarily naive. Measuring repeated gets of the same value is plain wrong. The following results give a much better picture of what is going on.
```
Perl version 5.010001
            Rate 10^6 keys 10^5 keys 10^1 keys 10^4 keys 10^3 keys 10^2 keys
10^6 keys 1.10/s        --      -36%      -64%      -67%      -68%      -69%
10^5 keys 1.73/s       57%        --      -43%      -49%      -50%      -52%
10^1 keys 3.06/s      177%       76%        --      -10%      -12%      -15%
10^4 keys 3.40/s      207%       96%       11%        --       -3%       -5%
10^3 keys 3.49/s      216%      101%       14%        3%        --       -3%
10^2 keys 3.58/s      224%      107%       17%        6%        3%        --
```
The results above were measured on a system with a 5 MB CPU cache. Note that performance drops significantly, from about 3.5M lookups/s for the small hashes to about 1M lookups/s at 10^6 keys. It is still very fast, though, and in some cases you can beat even systems like an RDBMS if you know what you are doing. You can measure your own system using the following code:
```
#!/usr/bin/perl

use strict;
use warnings;

use Benchmark;

print "Perl version $]\n";

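my %subs;
for my $n ( 1 .. 6 ) {
    my $m = 10**$n;        # number of keys in this hash
    keys( my %h ) = $m;    # preallocate the hash so it doesn't have to keep growing
    my $k = "a";
    %h = ( map { $k++ => 1 } 1 .. $m );    # keys "a", "b", ..., via string increment
    my $l = 10**( 6 - $n );    # repeat count, so every sub does 10^6 lookups in total
    my $value;                 # lookup target (avoids shadowing sort's $a)
    $subs{"10^$n keys"} = sub {
        for ( 1 .. $l ) {
            $value = $h{$_} for keys %h;    # look up every key once
        }
    };
}

# Run each sub for at least 3 CPU seconds and print a comparison table.
Benchmark::cmpthese( -3, \%subs );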
```
You also shouldn't forget that hash lookup time depends on key length. Simply put, there is no real technology with O(1) access time. Every known real technology has O(log N) access time at best. The only systems with O(1) access time get it by limiting their maximum N, and they pay for it with degraded performance at low N. That is how things work in the real world, and it is the reason people keep designing algorithms like Judy arrays, while as hardware evolves the situation only gets worse.
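The key-length point can be checked the same way as the hash-size one. The following is only a rough sketch, not part of the original measurements: it keeps the number of keys fixed and varies the key length. The helper `make_keys`, the key count of 10,000, and the lengths 8, 64 and 512 are all arbitrary choices made for illustration.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use Benchmark qw(cmpthese);

# Hypothetical helper (not from the original answer): build $count distinct
# keys, each exactly $len characters long, by zero-padding the numbers 1..$count.
sub make_keys {
    my ( $len, $count ) = @_;
    return map { sprintf "%0${len}d", $_ } 1 .. $count;
}

my $count = 10_000;    # assumed size; the same number of keys in every hash
my %subs;
for my $len ( 8, 64, 512 ) {
    my @keys = make_keys( $len, $count );
    my %h    = map { $_ => 1 } @keys;
    my $value;
    $subs{"len $len"} = sub {
        $value = $h{$_} for @keys;    # one lookup per key
    };
}

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese( -1, \%subs );
```

Because each sub performs the same number of lookups, any difference in the reported rates should come from key length (hashing and comparing longer strings, plus the extra memory touched) rather than from the number of entries. Actual numbers will vary by machine and Perl version.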