Probability of collision when using a 32 bit hash

China☆狼群 posted on 2019-11-26 09:16:41

Question


I have a 10-character string key field in a database. I've used CRC32 to hash this field, but I'm worried about duplicates. Could somebody show me the probability of a collision in this situation?

P.S. The string field is unique in the database. If there are 1 million such strings, what is the probability of a collision?


Answer 1:


Duplicate of "Expected collisions for perfect 32bit crc".

The answer referenced this article: http://arstechnica.com/civis/viewtopic.php?f=20&t=149670

A chart of hash collision probabilities is available at: http://preshing.com/20110504/hash-collision-probabilities




Answer 2:


In the case you cite, at least one collision is essentially guaranteed. The probability of at least one collision is about $1 - 3 \times 10^{-51}$. The average number of collisions you would expect is about 116.

In general, the average number of collisions in k samples, each a random choice among n possible values, is:

$$\frac{k(k-1)}{2n}$$

The probability of at least one collision is:

$$1 - e^{-k(k-1)/(2n)}$$

In your case, $n = 2^{32}$ and $k = 10^6$.
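The figures quoted above can be checked with a few lines of Python. This is only a sketch: it assumes the hash behaves like a uniform random choice among n values (CRC32 is not guaranteed to), and it uses the usual birthday-problem approximations k(k-1)/(2n) for the expected number of colliding pairs and 1 - exp(-k(k-1)/(2n)) for the chance of at least one collision. The function names are made up for illustration.

```python
import math


def expected_collisions(k: int, n: int) -> float:
    """Expected number of colliding pairs among k uniform samples from n values."""
    return k * (k - 1) / (2 * n)


def no_collision_probability(k: int, n: int) -> float:
    """Approximate probability that all k samples are distinct."""
    return math.exp(-expected_collisions(k, n))


def collision_probability(k: int, n: int) -> float:
    """Approximate probability of at least one collision."""
    # expm1 keeps precision when the expected count is tiny.
    return -math.expm1(-expected_collisions(k, n))


n = 2 ** 32   # number of possible 32-bit hash values
k = 10 ** 6   # number of keys in the question

print(expected_collisions(k, n))       # about 116.4
print(no_collision_probability(k, n))  # about 3e-51
print(collision_probability(k, n))     # indistinguishable from 1.0 in floating point
```

Because the chance of no collision is so small, it is easier to print that value directly than to look at 1 minus it, which rounds to 1.0 in double precision.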

The probability of a three-way collision in your case is about 0.01. See the Birthday Problem.
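The three-way figure can be estimated in a similar spirit. The answer does not give a formula for it, so the sketch below is an assumption: it counts the expected number of triples that all share one hash value, C(k, 3) / n^2, and treats that count as roughly Poisson-distributed. The function names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
import math


def expected_three_way(k: int, n: int) -> float:
    """Expected number of triples of samples that all land on the same value."""
    return math.comb(k, 3) / n ** 2


def three_way_probability(k: int, n: int) -> float:
    """Poisson-style estimate of the chance of at least one three-way collision."""
    return -math.expm1(-expected_three_way(k, n))


n = 2 ** 32   # number of possible 32-bit hash values
k = 10 ** 6   # number of keys in the question

print(three_way_probability(k, n))  # about 0.009, i.e. roughly 0.01
```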



Source: https://stackoverflow.com/questions/14210298/probability-of-collision-when-using-a-32-bit-hash
