Adler32 Repeating Very Quickly

Submitted by 和自甴很熟 on 2019-12-05 01:00:34

Question


I'm using the Adler-32 checksum algorithm to generate a number from a database id. When I insert a row into the database, I take the identity of that row and use it to create the checksum. The problem I'm running into is that I got a repeated checksum after only 207 inserts into the database. This is much, much sooner than I expected. Here is my code:

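import java.nio.charset.StandardCharsets;
import java.util.zip.Adler32;
import java.util.zip.Checksum;

// Checksum the decimal string form of the database id.
String dbIdStr = Long.toString(dbId);
byte[] bytes = dbIdStr.getBytes(StandardCharsets.UTF_8);
Checksum checksum = new Adler32();
checksum.update(bytes, 0, bytes.length);
long result = checksum.getValue();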

Is there something wrong with what I'm doing, or how I'm doing it? Should I be using a different method to create unique strings? I'm doing this because I don't want to use the db id in a URL... a change to the structure of the db would break all the links out there in the world.

Thanks!


Answer 1:


You should not be using Adler-32 as a hash code generator. That's not what it's for. You should use an algorithm that has good hash properties, which, among other things, means it minimizes the probability of collisions.
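
To see why the repeat shows up so early, it may help to reproduce it. For a three-byte input with byte values d1, d2, d3, Adler-32 reduces to A = 1 + d1 + d2 + d3 and B = 3 + 3*d1 + 2*d2 + d3, so two different digit strings can easily agree on both halves. The sketch below assumes sequential ids starting at 1 (the question does not say how ids were assigned), and the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.Adler32;

public class AdlerCollisionDemo {

    // Adler-32 of the decimal string form of an id, as in the question.
    static long adlerOfId(long id) {
        Adler32 adler = new Adler32();
        byte[] bytes = Long.toString(id).getBytes(StandardCharsets.US_ASCII);
        adler.update(bytes, 0, bytes.length);
        return adler.getValue();
    }

    // Returns {earlierId, laterId} for the first repeated checksum among
    // ids 1..maxId, or null if every checksum in that range is distinct.
    static long[] firstCollision(long maxId) {
        Map<Long, Long> seen = new HashMap<>();
        for (long id = 1; id <= maxId; id++) {
            Long previous = seen.put(adlerOfId(id), id);
            if (previous != null) {
                return new long[] {previous, id};
            }
        }
        return null;
    }

    public static void main(String[] args) {
        long[] pair = firstCollision(1000);
        System.out.println(pair == null
                ? "no collision"
                : "ids " + pair[0] + " and " + pair[1] + " share a checksum");
    }
}
```

Under that assumption the first repeat is "201", which has the same checksum as "120". That is close to the 207 inserts reported in the question; gaps in the id sequence could account for the difference.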

You can simply use Java's hashCode method (available on any object). For a String, the hash code is the sum of the character values of the string, each multiplied by a successive power of 31. There can be collisions with very short strings, but it's not a horrible algorithm. It's definitely a lot better than Adler-32 as a hash algorithm.
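
As a rough illustration of that formula, the sketch below recomputes a String hash by hand and compares it with the built-in hashCode. The class and method names are hypothetical. It also shows the kind of short-string collision mentioned above: "Aa" and "BB" both hash to 2112.

```java
public class StringHashDemo {

    // Recomputes String.hashCode() by Horner's rule: h = 31 * h + c per char,
    // which equals s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] in int arithmetic.
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String id = "207"; // hypothetical database id rendered as a string
        System.out.println(manualHash(id) + " == " + id.hashCode());
        System.out.println("Aa".hashCode() + " == " + "BB".hashCode());
    }
}
```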

The suggestions to use a cryptographically secure hash function (like SHA-256) are certainly overkill for your application, both in terms of execution time and hash code size. You should try Java's hashCode and see how many collisions you get. If collisions seem much more frequent than you'd expect for a 2^-n probability per pair (where n is the number of bits in the hash code), then you can replace it with a better one. You can find a link here for decent Java hash functions.
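
To put a number on that expectation, here is a minimal sketch assuming an ideal hash whose outputs are uniform over n bits. Each pair of distinct inputs then collides with probability 2^-n, so k inputs give about k(k-1)/2 * 2^-n colliding pairs. The class and method names are made up for this example.

```java
public class CollisionEstimate {

    // Expected number of colliding pairs among k values drawn uniformly
    // from an n-bit hash space: C(k, 2) * 2^-n.
    static double expectedCollidingPairs(long k, int n) {
        double pairs = k * (k - 1) / 2.0;
        return pairs * Math.pow(2.0, -n);
    }

    public static void main(String[] args) {
        // 207 inserts into a 32-bit space, as in the question.
        System.out.println(expectedCollidingPairs(207, 32));
        // A larger table, for comparison.
        System.out.println(expectedCollidingPairs(100_000, 32));
    }
}
```

For 207 values and a 32-bit hash the estimate is roughly 5 in a million, so a repeat at that point indicates the function is far from uniform on these inputs. At around 100,000 values the estimate passes one colliding pair, so even a good 32-bit hash cannot guarantee uniqueness on its own.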




Answer 2:


Try using a secure hash function like SHA-256. If you ever find a collision between any two inputs that are not binary equal, you'll get $1000 in your bank account, with my compliments. The offer ends if/when SHA-2 is cracked and you enter a collision deliberately. That said, the output is 32 bytes instead of 32 bits.
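
A minimal sketch of what that could look like with the standard java.security.MessageDigest API, hashing the decimal string form of the id as in the question. The class name, the helper names, and the choice of hex encoding are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha256Demo {

    // SHA-256 digest of a string's UTF-8 bytes: always 32 bytes long.
    static byte[] sha256(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return md.digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform implementation is required to support SHA-256.
            throw new IllegalStateException(e);
        }
    }

    // Lowercase hex encoding, two characters per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long dbId = 207L; // hypothetical id
        System.out.println(toHex(sha256(Long.toString(dbId))));
    }
}
```

The hex form is 64 characters long, which illustrates the size trade-off noted above when the value is meant to appear in a URL.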



Source: https://stackoverflow.com/questions/11597762/adler32-repeating-very-quickly
