Truncating an MD5 hash: how do I calculate the odds of a collision occurring?

Posted by 雨燕双飞 on 2019-12-12 08:18:47

Question


I want to truncate an MD5 hash to about half its size. How much does that increase the odds of a collision? If I'm generating around 500,000 hashes, should I be worried about a collision? What about 1,000,000?


Answer 1:


The math you're looking for is on Wikipedia's birthday attack page.

We consider the following experiment. From a set of H values we choose n values uniformly at random, thereby allowing repetitions. Let p(n; H) be the probability that during this experiment at least one value is chosen more than once. This probability can be approximated as

$$p(n; H) \approx 1 - e^{-n(n-1)/(2H)} \approx 1 - e^{-n^2/(2H)}$$

With the full 128 bits, the chance of a collision among 500,000 hash values is around $10^{-28}$. If you halve the hash length to 64 bits, the chance rises to around $10^{-9}$ to $10^{-8}$ (about $7 \times 10^{-9}$, or roughly 1 in 150 million, using $p \approx n^2/(2H)$ with $H = 2^{64}$). For 1,000,000 values at 64 bits it is about $3 \times 10^{-8}$. That is, even though the chance is vastly greater, it's still very, very low. Whether that is acceptable depends on how critical it is that there be no collisions: while extremely unlikely, one is within the realm of possibility.
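To make these figures reproducible, here is a minimal Python sketch of the birthday approximation $p(n; H) \approx 1 - e^{-n(n-1)/(2H)}$. The function name `collision_probability` is made up for illustration, and the calculation assumes the truncated MD5 output is uniformly distributed.

```python
import math

def collision_probability(n: int, bits: int) -> float:
    """Approximate probability of at least one collision among n
    uniformly random values drawn from a space of 2**bits values."""
    space = 2 ** bits
    exponent = n * (n - 1) / (2 * space)
    # -expm1(-x) evaluates 1 - e^(-x) without losing precision when x is tiny
    return -math.expm1(-exponent)

if __name__ == "__main__":
    for n in (500_000, 1_000_000):
        for bits in (128, 64):
            p = collision_probability(n, bits)
            print(f"n={n:,} bits={bits} p~{p:.2e}")
```

Using `math.expm1` rather than `1 - math.exp(-x)` matters here: for the 128-bit case the exponent is so small that the naive form would round to exactly 0.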

For reference:

$10^{28}$ = 10 octillion = 10 billion billion billion
$10^{9}$ = 1 billion




Answer 2:


There's an interesting mathematical problem called the birthday problem that deals with this kind of situation. The more entries you add, the higher the chance of a collision.

Following the table on the linked birthday problem page, and assuming your digests are 64 bits each (since a full MD5 hash is 128 bits) and that MD5 output is uniformly distributed, there is a very low chance that two hashes will collide. The chance becomes significant (1% or more) at about 610,000,000 entries.
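The 610,000,000 figure can be checked by inverting the birthday approximation: solving $p \approx 1 - e^{-n^2/(2H)}$ for n gives $n \approx \sqrt{2H \ln\frac{1}{1-p}}$. Below is a rough Python sketch under the same uniformity assumption; `entries_for_probability` is a hypothetical helper name, not part of any library.

```python
import math

def entries_for_probability(p: float, bits: int) -> float:
    """Approximate number of uniformly random values of the given bit
    length needed for the collision probability to reach p (0 < p < 1)."""
    space = 2 ** bits
    # -log1p(-p) equals ln(1 / (1 - p)) and stays accurate for small p
    return math.sqrt(2 * space * -math.log1p(-p))

if __name__ == "__main__":
    for bits in (64, 128):
        n = entries_for_probability(0.01, bits)
        print(f"bits={bits} entries for a 1% collision chance ~{n:.2e}")
```

For 64 bits and p = 0.01 this gives roughly 6.1 × 10^8 entries, consistent with the table value quoted above.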



Source: https://stackoverflow.com/questions/2256423/truncating-an-md5-hash-how-do-i-calculate-the-odds-of-a-collision-occurring
