Uniform distribution of truncated md5?

Submitted by 孤街浪徒 on 2019-11-30 03:13:33

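Yes. Showing no bias in its output is a design requirement for a cryptographic hash. MD5 is broken from a cryptographic point of view (practical collision attacks exist), but the distribution of its results was never in question, so a truncated digest should be uniformly distributed as well.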

If you still need convincing, it's not a huge undertaking to hash a bunch of files, truncate the output and use ent (http://www.fourmilab.ch/random/) to analyze the result.
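If ent isn't at hand, a rough version of the same check can be scripted. The sketch below is not ent itself: it is a plain chi-square computation over the first bits of each digest, with the helper name `truncatedMd5ChiSquare` and the inputs (the decimal strings "0", "1", ...) being assumptions made for illustration rather than anything from the answer above.

```php
<?php
// Chi-square statistic for the first $bits bits of md5() over $n inputs.
// Hypothetical helper; the inputs are simply the decimal strings "0", "1", ...
// $bits must be between 1 and 24, because only 6 hex digits are read.
function truncatedMd5ChiSquare(int $n, int $bits = 8): float
{
    $buckets = 1 << $bits;                  // number of possible truncated values
    $counts = array_fill(0, $buckets, 0);

    for ($i = 0; $i < $n; $i++) {
        // Read the first 6 hex digits (24 bits) and keep only the top $bits bits.
        $prefix = hexdec(substr(md5((string)$i), 0, 6));
        $counts[$prefix >> (24 - $bits)]++;
    }

    $expected = $n / $buckets;              // hits per bucket if perfectly uniform
    $chi2 = 0.0;
    foreach ($counts as $observed) {
        $chi2 += (($observed - $expected) ** 2) / $expected;
    }
    return $chi2;
}

$chi2 = truncatedMd5ChiSquare(200000, 8);
printf("chi-square over 256 buckets: %.1f (255 degrees of freedom)\n", $chi2);
```

For a uniform source the statistic should land near the number of degrees of freedom (255 here), give or take a few dozen; a value far outside that range would be the sign of bias to look for.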

I wrote a little PHP program to answer this question. It's not very scientific, but it tallies the first and the last 8 bits of each hash value (both into one shared table of 256 counters), using the natural numbers as the text to hash. After about 40,000,000 hashes the difference between the highest and the lowest count drops to about 1% of the highest, so I'd say the distribution is OK. I hope the code is more precise in explaining what was computed :-) Btw, with a similar program I found that the last 8 bits seem to be distributed slightly better than the first.

<?php
// Set up the count array: one counter per two-hex-digit value, "00" to "ff".
$count = [];
for ($y=0; $y<16; $y++) {
  for ($x=0; $x<16; $x++) {
    $count[dechex($x).dechex($y)] = 0;
  }
}

$text = 1; // The text we will hash.
$hashCount = 0;
$steps = 10000;

while (1) {
  // Calculate & count a bunch of hashes:
  for ($i=0; $i<$steps; $i++) {
    $hash = md5((string)$text);       // md5() returns 32 hex digits
    $count[substr($hash, 0, 2)]++;    // first 8 bits
    $count[substr($hash, -2)]++;      // last 8 bits, added to the same table
    $text++;
  }
  $hashCount += $steps;

  // Output result so far:
  system("clear");
  $min = PHP_INT_MAX; $max = 0;
  for ($y=0; $y<16; $y++) {
    for ($x=0; $x<16; $x++) {  
      $n = $count[dechex($x).dechex($y)];
      if ($n < $min) $min = $n;
      if ($n > $max) $max = $n;
      print $n."\t";
    }
    print "\n";
  }
  print "Hashes: $hashCount, Min: $min, Max: $max, Delta: ".((($max-$min)*100)/$max)."%\n";
} 
?>
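
The script above adds the first and the last byte of every hash to the same table and runs until interrupted. To compare the two positions, as the "similar program" mentioned earlier presumably did, the tallies have to be kept apart. The following is a bounded sketch of that idea, not the author's actual second program; the helper name `byteSpread` and the run length are assumptions.

```php
<?php
// Tally the first and the last byte of md5() separately for the inputs "1".."$n".
// Hypothetical helper; returns [min, max, delta in percent] for each position.
function byteSpread(int $n): array
{
    $first = array_fill(0, 256, 0);
    $last  = array_fill(0, 256, 0);

    for ($i = 1; $i <= $n; $i++) {
        $hash = md5((string)$i);
        $first[hexdec(substr($hash, 0, 2))]++;
        $last[hexdec(substr($hash, -2))]++;
    }

    $summary = [];
    foreach (['first' => $first, 'last' => $last] as $name => $counts) {
        $min = min($counts);
        $max = max($counts);
        // Same "Delta" definition as the script above: (max - min) / max.
        $summary[$name] = [$min, $max, ($max - $min) * 100 / $max];
    }
    return $summary;
}

$n = 1000000;
foreach (byteSpread($n) as $name => [$min, $max, $delta]) {
    printf("%-5s byte: min %d, max %d, delta %.2f%%\n", $name, $min, $max, $delta);
}
// Rough yardstick: a uniform source puts about n/256 hits in each bucket,
// with a standard deviation close to sqrt(n/256).
printf("expected per bucket: %.0f, standard deviation about %.0f\n", $n / 256, sqrt($n / 256));
```

Setting the two deltas against that yardstick helps judge whether one position really looks "slightly better" or whether the gap is within ordinary sampling noise.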