MurmurHash - what is it?

妖精的绣舞 提交于 2019-11-26 18:46:00

问题


I've been trying to get a high level understanding of what MurmurHash does.

I've read a basic description but have yet to find a good explanation of when to use it and why. I know its very fast but want to know a bit more.

I asked a related question about how I could fit a UUID into a Redis bitset, and someone suggested using MurmurHash. It works but I'd like to understand the risks/benefits.


回答1:


Murmur is a family of good general purpose hashing functions, suitable for non-cryptographic usage. As stated by Austin Appleby, MurmurHash provides the following benefits:

  • simple (in term of number of generated assembly instructions).
  • good distribution (passing chi-squared tests for practically all keysets & bucket sizes.
  • good avalanche behavior (max bias of 0.5%).
  • good collision resistance (passes Bob Jenkin's frog.c torture-test. No collisions possible for 4-byte keys, no small (1- to 7-bit) differentials).
  • great performance on Intel/AMD hardware, good tradeoff between hash quality and CPU consumption.

You can certainly use it to hash UUIDs (like any other advanced hashing functions: CityHash, Jenkins, Paul Hsieh's, etc ...). Now, a Redis bitset is limited to 4 GB bits (512 MB). So you need to reduce 128 bits of data (UUID) to 32 bits (hashed value). Whatever the quality of the hashing function, there will be collisions.

Using an engineered hash function like Murmur will maximize the quality of the distribution, and minimize the number of collisions, but it offers no other guarantee.

Here are some links comparing the quality of general purpose hash functions:

http://www.azillionmonkeys.com/qed/hash.html

http://www.strchr.com/hash_functions

http://blog.aggregateknowledge.com/2011/12/05/choosing-a-good-hash-function-part-1/

http://blog.aggregateknowledge.com/2011/12/29/choosing-a-good-hash-function-part-2/

http://blog.aggregateknowledge.com/2012/02/02/choosing-a-good-hash-function-part-3/




回答2:


I know I am reply late, but it may help any one else...

Murmur hashing is a non cryptographic hash function which is used for hash based look-ups , it uses 3 basic operations as a whole Multiply, Rotate and XOR. It uses multiple constants which are just there to make it good hash function by passing 2 basic tests.

  1. Avalanche Test
  2. Chi-Squared Test

You can watch this video, which I made, for the detail explanation of Murmur Hashing.



来源:https://stackoverflow.com/questions/11899616/murmurhash-what-is-it

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!