Hash function for floats

野趣味 2020-12-03 07:38

I'm currently implementing a hash table in C++ and I'm trying to make a hash function for floats...

I was going to treat floats as integers by padding the decimal

6 answers
  •  醉话见心
    2020-12-03 07:44

    You can of course represent a float as an int type of the same size and hash that. However, this naive approach has some pitfalls you need to be careful of...

    Simply hashing the binary representation is error-prone, since values that compare equal won't necessarily have the same binary representation.

    An obvious case: -0.0 won't match 0.0, for example. *
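To make the signed-zero pitfall concrete, here is a minimal sketch. It assumes a 32-bit IEEE 754 `float`; `float_bits` and `equal_but_different_bits` are hypothetical helper names, not part of any library.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: return the raw bit pattern of a float.
// Assumes float is 32 bits wide (checked at compile time).
std::uint32_t float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);  // bit copy, not a numeric conversion
    return u;
}

// True when two floats compare equal but their bit patterns differ,
// which is exactly the case that breaks a hash of the raw bits.
bool equal_but_different_bits(float a, float b) {
    return a == b && float_bits(a) != float_bits(b);
}
```

On an IEEE 754 platform, `equal_but_different_bits(0.0f, -0.0f)` is true: the two zeros compare equal but differ in the sign bit, so a hash of the raw bits would put equal keys in different buckets.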

    Further, simply converting to an int of the same size won't give a very even distribution, which is often important (when implementing a hash/set that uses buckets, for example).

    Suggested steps for implementation:

    • handle the special cases: non-finite values (nan, inf) and signed zeros (0.0, -0.0). Whether you need to do this explicitly depends on the method used.
    • convert to an int of the same size
      (that is, reinterpret the float's bits as an int, using a union for example, rather than simply casting to an int. In C++, std::memcpy is the well-defined way to do this; reading the inactive member of a union is only defined behaviour in C).
    • redistribute the bits (intentionally vague here!); this is basically a speed vs. quality tradeoff. But if you have many values in a small range, you probably don't want their hashes to fall in a similarly small range too.
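The steps above could be sketched roughly as follows. This is one possible reading, not a definitive implementation: it assumes a 32-bit IEEE 754 `float`, the name `hash_float` is hypothetical, all NaNs are collapsed to a single value (a use-case decision), and the mixing step borrows the 32-bit finalizer from MurmurHash3 as one arbitrary choice on the speed vs. quality scale.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: hash a float in three steps.
std::uint32_t hash_float(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");

    // Step 1: collapse values that should hash alike.
    // -0.0f == 0.0f is true, so this maps both zeros to +0.0f.
    if (f == 0.0f) f = 0.0f;
    // Assumption: every NaN (any sign, any payload) hashes the same.
    if (std::isnan(f)) f = std::numeric_limits<float>::quiet_NaN();
    // +inf and -inf each have a single bit pattern, so they need no
    // special handling with this method.

    // Step 2: reinterpret the bits as an integer of the same size.
    std::uint32_t h;
    std::memcpy(&h, &f, sizeof h);

    // Step 3: redistribute the bits (MurmurHash3 32-bit finalizer).
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}
```

A table with N buckets would then use something like `hash_float(key) % N`. Note that the zero normalisation relies on strict IEEE semantics; aggressive floating-point compiler flags may optimise it away.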

    *: You may want to check for (nan and -nan) too. How to handle those exactly depends on your use case (you may want to ignore the sign for all nans, as CPython does).

    Python's _Py_HashDouble is a good reference for how you might hash a float in production code (ignore the -1 check at the end, since that's a special value for Python).
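For a feel of that approach, here is a C++ sketch loosely modelled on it: the value is split into mantissa and exponent with `std::frexp` and reduced modulo the prime 2^61 - 1, so the hash depends on the numeric value rather than the bit pattern. The function name `hash_double_modular` is hypothetical, and the constants (61 bits, 314159 for infinity, 0 for NaN) should be treated as assumptions and checked against the CPython source. The -1 check is left out, as suggested above.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical sketch: hash a double by reducing its value mod 2^61 - 1.
std::int64_t hash_double_modular(double v) {
    const int kBits = 61;                                   // assumed width
    const std::uint64_t kModulus = (std::uint64_t{1} << kBits) - 1;

    if (std::isinf(v)) return v > 0 ? 314159 : -314159;     // assumed constant
    if (std::isnan(v)) return 0;                            // assumed constant

    int e = 0;
    double m = std::frexp(v, &e);  // v == m * 2^e, with 0.5 <= |m| < 1
    int sign = 1;
    if (m < 0) { sign = -1; m = -m; }

    // Consume the mantissa 28 bits at a time, rotating x within 61 bits.
    std::uint64_t x = 0;
    while (m != 0.0) {
        x = ((x << 28) & kModulus) | (x >> (kBits - 28));
        m *= 268435456.0;  // 2^28
        e -= 28;
        std::uint64_t y = static_cast<std::uint64_t>(m);  // integer part
        m -= static_cast<double>(y);
        x += y;
        if (x >= kModulus) x -= kModulus;
    }

    // Apply the exponent as a rotation, after reducing it modulo kBits.
    e = e >= 0 ? e % kBits : kBits - 1 - ((-1 - e) % kBits);
    x = ((x << e) & kModulus) | (x >> (kBits - e));

    const std::int64_t h = static_cast<std::int64_t>(x);
    return sign < 0 ? -h : h;
}
```

With this scheme, 0.0 and -0.0 both hash to 0 without any explicit check, and whole-number values such as 3.0 hash to the integer itself. That suits Python, where `1.0 == 1` must imply equal hashes, but it distributes small integers poorly on its own, so a bucketed table may still want a mixing step afterwards.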
