Uniform distribution of truncated md5?

小鲜肉 2020-12-25 10:29

Can we say that a truncated md5 hash is still uniformly distributed?

To avoid misinterpretations: I'm aware the chance of collisions is much greater th

2 Answers
  • 2020-12-25 11:12

    Yes. Not exhibiting any bias is a design requirement for a cryptographic hash. MD5 is broken from a cryptographic point of view (collisions can be constructed deliberately); however, the distribution of its output was never in question.
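
A short worked argument for why truncation keeps the distribution flat. It assumes an idealized hash $H$ whose 128-bit output is exactly uniform; that is a modelling assumption, not a proven property of MD5. Here $\operatorname{trunc}_k$ (a name introduced only for this sketch) keeps the first $k$ bits, and $t \,\|\, s$ denotes concatenation:

```latex
\Pr\bigl[\operatorname{trunc}_k(H) = t\bigr]
  = \sum_{s \in \{0,1\}^{128-k}} \Pr\bigl[H = t \,\|\, s\bigr]
  = 2^{128-k} \cdot 2^{-128}
  = 2^{-k}
  \qquad \text{for every } t \in \{0,1\}^{k}.
```

The same counting works for any fixed choice of $k$ bit positions, not just the leading ones.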

    If you still need to be convinced, it's not a huge undertaking to hash a bunch of files, truncate the output and use ent ( http://www.fourmilab.ch/random/ ) to analyze the result.
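
The experiment described above can be sketched without external tools. This is a minimal sketch, not a replacement for ent: it hashes the decimal strings "0", "1", ... instead of files, keeps only the first byte of each digest, and computes Pearson's chi-square statistic by hand (one of the figures ent also reports). The function names and the choice of inputs are assumptions made for illustration.

```python
import hashlib


def truncated_md5_counts(n_inputs, n_bytes=1):
    """Tally truncated MD5 values for the inputs "0", "1", ..., str(n_inputs - 1)."""
    counts = [0] * (256 ** n_bytes)
    for i in range(n_inputs):
        digest = hashlib.md5(str(i).encode("ascii")).digest()
        # Truncate: keep only the leading n_bytes of the 16-byte digest.
        value = int.from_bytes(digest[:n_bytes], "big")
        counts[value] += 1
    return counts


def chi_square(counts):
    """Pearson's chi-square statistic against a perfectly flat distribution."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


if __name__ == "__main__":
    counts = truncated_md5_counts(100_000)
    print("buckets:", len(counts))
    print("min/max count:", min(counts), max(counts))
    print("chi-square (255 degrees of freedom):", round(chi_square(counts), 1))
```

For 256 buckets, a uniform source gives a statistic that scatters around 255 (the number of degrees of freedom); values far above that would point to bias.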

  • 2020-12-25 11:32

    I wrote a little PHP program to answer this question. It's not very scientific, but it shows the distribution of the first and the last 8 bits of the hash values, using the natural numbers as the text to hash. After about 40,000,000 hashes the difference between the highest and the lowest counts goes down to 1%, so I'd say the distribution is OK. I hope the code explains more precisely what was computed :-) By the way, with a similar program I found that the last 8 bits seem to be distributed slightly better than the first.

    <?php
    // Set up the count array: one counter per 2-hex-digit value.
    $count = [];
    for ($y=0; $y<16; $y++) {
      for ($x=0; $x<16; $x++) {
        $count[dechex($x).dechex($y)] = 0;
      }
    }
    
    $text = 1; // The text we will hash.
    $hashCount = 0;
    $steps = 10000;
    
    while (1) {
      // Calculate & count a bunch of hashes. The first and the last byte
      // (2 hex digits each) are tallied into the same table.
      for ($i=0; $i<$steps; $i++) {
        $hash = md5((string)$text);
        $count[substr($hash, 0, 2)]++;
        $count[substr($hash, -2)]++;
        $text++;
      }
      $hashCount += $steps;
    
      // Output result so far:
      system("clear");
      $min = PHP_INT_MAX; $max = 0;
      for ($y=0; $y<16; $y++) {
        for ($x=0; $x<16; $x++) {  
          $n = $count[dechex($x).dechex($y)];
          if ($n < $min) $min = $n;
          if ($n > $max) $max = $n;
          print $n."\t";
        }
        print "\n";
      }
      print "Hashes: $hashCount, Min: $min, Max: $max, Delta: ".((($max-$min)*100)/$max)."%\n";
    } 
    ?>
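
The PHP program adds the first-byte and last-byte tallies into one table, so it cannot tell the two positions apart. A minimal Python sketch that keeps them separate and reports the same max/min delta for each is given here; the function name `byte_position_deltas` is hypothetical, and the inputs are the decimal strings "1", "2", ... as in the PHP version.

```python
import hashlib


def byte_position_deltas(n_inputs):
    """Return (first_byte_delta, last_byte_delta) in percent for inputs "1".."n_inputs"."""
    first = [0] * 256
    last = [0] * 256
    for i in range(1, n_inputs + 1):
        digest = hashlib.md5(str(i).encode("ascii")).digest()
        first[digest[0]] += 1    # first 8 bits of the hash
        last[digest[-1]] += 1    # last 8 bits of the hash

    def delta(counts):
        # Same measure as the PHP script: (max - min) relative to max, in percent.
        return (max(counts) - min(counts)) * 100 / max(counts)

    return delta(first), delta(last)


if __name__ == "__main__":
    first_delta, last_delta = byte_position_deltas(200_000)
    print(f"first byte delta: {first_delta:.2f}%")
    print(f"last byte delta:  {last_delta:.2f}%")
```

Some difference between the two deltas is to be expected from random fluctuation alone, so a small gap at a single sample size is not by itself evidence that one position is better distributed than the other.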
    