Question
Is there a 128-bit hashing algorithm (cryptographic or non-cryptographic, it doesn't matter) that guarantees no collision can occur?
What if I can guarantee that my string will not exceed a specific length? (Is there such a length? I can guarantee a length of less than 100 characters.)
Thanks, J.B
Answer 1:
No, you can't make such an algorithm. If you have a string of 100 characters, you have (let each character be in the 1..255 range, rounded up to 256 values for simplicity) roughly
256**100 == (2**8)**100 == 2**800
different strings (potential collisions); a 128-bit hash function has only
2**128
different values. Since
2**128 < 2**800
collisions are inevitable: this is the pigeonhole principle.
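The counting argument above can be checked directly, and the pigeonhole effect is easy to observe on a scaled-down hash. The sketch below is only an illustration: `tiny_hash` and `find_collision` are hypothetical helpers, and truncating SHA-256 to 8 bits is an assumption made so that a collision shows up after at most 257 inputs.

```python
import hashlib

def tiny_hash(data: bytes, bits: int = 8) -> int:
    """Hypothetical toy hash: the top `bits` bits of SHA-256."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - bits)

def find_collision(bits: int = 8):
    """Hash 2**bits + 1 distinct inputs; two of them must share a value."""
    seen = {}
    for i in range(2 ** bits + 1):
        key = str(i).encode()
        h = tiny_hash(key, bits)
        if h in seen:
            return seen[h], key, h
        seen[h] = key
    raise AssertionError("unreachable: more inputs than hash values")

# More possible inputs than outputs, exactly as in the answer.
print(256 ** 100 > 2 ** 128)  # True

first, second, value = find_collision()
print(first, second, value)
```

The same reasoning applies to any 128-bit function: once there are more distinct inputs than 2**128 outputs, at least two inputs must map to the same value.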
Edit: imagine that we have a 128-bit function; what is the maximum length of a string for which it can be collision-free?
256**length = 2**128
(2**8)**length = 2**128
8 * length = 128
length = 16
So the maximum length is 16 (I've assumed for simplicity that the string doesn't contain '\0'). If the string is a Unicode one (i.e. has characters in the 1..65535 range, rounded up to 65536 values):
65536**length = 2**128
(2**16)**length = 2**128
16 * length = 128
length = 8
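The two calculations above can be reproduced with a short sketch. The function names are hypothetical, and the result applies to strings of exactly that length; if shorter strings must be covered as well, the total number of inputs exceeds 2**128 and the limit drops by one.

```python
def max_collision_free_length(alphabet_size: int, hash_bits: int = 128) -> int:
    """Largest n such that alphabet_size**n <= 2**hash_bits."""
    length = 0
    while alphabet_size ** (length + 1) <= 2 ** hash_bits:
        length += 1
    return length

def identity_hash(data: bytes) -> int:
    """Collision-free 128-bit 'hash' for inputs of exactly 16 bytes:
    the bytes themselves, read as an integer."""
    if len(data) != 16:
        raise ValueError("expected exactly 16 bytes")
    return int.from_bytes(data, "big")

print(max_collision_free_length(256))    # 16
print(max_collision_free_length(65536))  # 8
```

`identity_hash` shows that the bound is reachable: for fixed-length 16-byte inputs, the input itself already is a unique 128-bit value.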
Answer 2:
You cannot make a mathematical guarantee that no collision occurs.
But you can make a practical guarantee: the probability of a collision can be so low that it is acceptable for you. One example is randomly generated UUIDs, where the probability of duplicates is so low that it's not a problem in practice. The same goes for content-addressable storage, which typically relies on the practical (not mathematical) uniqueness of cryptographic hash functions.
Whether or not the hash algorithm is good enough for you depends on how many items you want to hash and what probability of collision is acceptable for you. You can then use the formula from the birthday problem to calculate whether 128 bits are sufficient.
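As a rough sketch of that calculation, the usual birthday approximation p ≈ 1 - exp(-n(n-1) / (2N)) can be evaluated for N = 2**128. This assumes the hash outputs are uniformly distributed, which is an idealization; `collision_probability` is a hypothetical helper name.

```python
import math

def collision_probability(num_items: int, hash_bits: int = 128) -> float:
    """Approximate chance of at least one collision among num_items
    uniformly random hash values of hash_bits bits."""
    space = 2 ** hash_bits
    exponent = num_items * (num_items - 1) / (2 * space)
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-exponent)

for n in (10 ** 6, 10 ** 9, 10 ** 12, 2 ** 64):
    print(f"{n:,d} items -> p = {collision_probability(n):.3e}")
```

Under this approximation the probability only becomes substantial when the number of items approaches 2**64, the square root of the output space.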
Source: https://stackoverflow.com/questions/50924076/128-bit-hash-without-collisions-guaranteed