Very low cost hash function

白昼怎懂夜的黑 提交于 2019-11-30 02:26:06

The canonical form of that is h(x) = (a*x + b) mod n, where a and b are constants and n is the size of your hash table. You want to make n a prime number, to get optimal(ish) distribution.

Note that this is sensitive to certain kind of distributions -- for example, just doing x mod n is mostly relying on randomness of low-order bits; if they are not random in your set, you will get fairly significant skew.

Bob Jenkins has designed several very good hashing functions; here's one specifically designed to be simple to implement in hardware: http://burtleburtle.net/bob/hash/nandhash.html

For a lot of different hash functions, design discussions, etc, see the rest of the site: http://burtleburtle.net/bob/hash/

CRC?

There is already a lot of hardware support for this too.

I believe this is the best possible hash for this problem (faster than modulo, better distribution), given that all your numbers in 0..N have the same probability:

h = z * n / N;

Where all values are integers, so you have an integer division. This way each value between 0..N is mapped to exactly the same number of values in n.

For example, when n=3 and N=7 (values 3 and 7 not included in the ranges), the hashes are this:

z * n / N = hash
----------------
0 * 3 / 7 = 0
1 * 3 / 7 = 0
2 * 3 / 7 = 0
3 * 3 / 7 = 1
4 * 3 / 7 = 1
5 * 3 / 7 = 2
6 * 3 / 7 = 2

So each hash values are used equally often, just off by 1. Just take care that n*(N-1) does not overflow.

If N is a power of 2, you can replace the division by shifting. e.g. if N=256:

h = (z * n) >> 8;

Rewire bits in random order and take lower log2(n) bits

Or just take lower log2(n) bits if your data is evenly distributed.

If you're truly talking hardware (vs. software, or hardware implementation of software), and your number of hash buckets n can be written as n = 2m - 1, the easiest is probably a maximum-length linear feedback shift register (LFSR) of which CRC is an instance.

Here's one way you could use an m-bit shift register to create a hash of a data packet (make sure all data is represented consistently as a K-bit string, if you have shorter strings then pad one end with zeros):

  1. Initialize the state of the LFSR (CRC-32 uses all 1's; all zeros is probably bad)
  2. Shift in the bits of your data
  3. (Optional) Shift in an additional j zeros (j between m and 2m is probably a good choice); this adds some additional hashing to reduce direct correlation between input/output bits
  4. Use the contents of the m-bit shift register as your hashed value.
标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!