Hash function for floats


野趣味 2020-12-03 07:38

I'm currently implementing a hash table in C++ and I'm trying to make a hash function for floats...

I was going to treat floats as integers by padding the decimal

6 answers

醉话见心 (original poster)

2020-12-03 07:44
You can of course represent a float as an int type of the same size in order to hash it. However, this naive approach has some pitfalls you need to be careful of.

Simply hashing the binary representation is error-prone, since values that compare equal won't necessarily have the same binary representation.

An obvious case: -0.0 won't match 0.0, for example. *
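To make the signed-zero pitfall concrete, here is a minimal sketch. It assumes a 32-bit IEEE 754 float, and `float_bits` is a helper name made up for this example:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the bytes of a float into a same-sized unsigned int.
// Assumes float is a 32-bit IEEE 754 type (true on most current platforms).
inline std::uint32_t float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// 0.0f == -0.0f compares true, yet under IEEE 754:
//   float_bits(0.0f)  is 0x00000000
//   float_bits(-0.0f) is 0x80000000
// so hashing the raw bits would place two equal keys in different slots.
```

A hash table requires that keys which compare equal also hash equal, which is why the raw bits can't be used unmodified.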

Further, simply converting to an int of the same size won't give a very even distribution, which is often important (when implementing a hash table or set that uses buckets, for example).
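A small sketch of how bad the distribution can get, assuming 32-bit IEEE 754 floats and a table that picks a bucket by masking with a power-of-two bucket count (`float_bits` and `naive_bucket` are names made up for this example):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: raw bit pattern of a 32-bit float.
inline std::uint32_t float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Naive bucket choice: use the raw bits directly as the hash and mask them.
// bucket_count is assumed to be a power of two.
inline std::size_t naive_bucket(float f, std::size_t bucket_count) {
    return float_bits(f) & (bucket_count - 1);
}
```

Small whole numbers such as 1.0f, 2.0f, 3.0f only use the top few mantissa bits, so their low bits are all zero. With 16 buckets, `naive_bucket` sends every one of the values 1.0f through 100.0f to bucket 0.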

Suggested steps for implementation:
- Filter out non-finite cases (nan, inf) and signed zeros (0.0, -0.0). Whether you need to do this explicitly depends on the method used.
- Convert to an int of the same size
  (that is, reinterpret the float's bits as an int rather than simply casting to an int. A union is the usual way in C; in C++, reading a union member other than the one last written is undefined behavior, so std::memcpy is the safer choice).
- Redistribute the bits (intentionally vague here!). This is basically a speed vs. quality tradeoff, but if you have many values in a small range, you probably don't want their hashes to land in a similarly small range too.
*: You may want to check for nan and -nan too. How to handle those exactly depends on your use case (you may want to ignore the sign for all nans, as CPython does).
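The steps above could be sketched roughly like this. This is only one possible reading: `hash_float` is a made-up name, a 32-bit IEEE 754 float is assumed, and the bit mixer is the 32-bit finalizer from MurmurHash3, swapped in here as one reasonable choice for the "redistribute" step. Infinities are not special-cased because each one has a single bit pattern when reinterpreted this way.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical sketch of the three steps; not a definitive implementation.
inline std::size_t hash_float(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");

    // Step 1: canonicalize values that should share one hash.
    if (f == 0.0f) {
        f = 0.0f;  // -0.0f compares equal to 0.0f, so this maps it to +0.0f
    }
    if (std::isnan(f)) {
        // Ignore the sign and payload of NaNs (a design choice, see the footnote).
        f = std::numeric_limits<float>::quiet_NaN();
    }

    // Step 2: reinterpret the bits as an integer of the same size.
    std::uint32_t h;
    std::memcpy(&h, &f, sizeof h);

    // Step 3: redistribute the bits (MurmurHash3 32-bit finalizer).
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}
```

With this, `hash_float(0.0f)` and `hash_float(-0.0f)` agree, and nearby inputs such as 1.0f and 2.0f no longer share their low bits. A cheaper or stronger mixer can be substituted depending on where you want to sit on the speed vs. quality tradeoff.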

Python's _Py_HashDouble is a good reference for how you might hash a float in production code (ignore the -1 check at the end, since -1 is a special value for Python).
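For a feel of the value-based idea behind that kind of hash, here is a rough C++ sketch. It is my own simplified rendering rather than a transcription of CPython's source: `hash_double_by_value` is a made-up name, the constants returned for nan and inf are arbitrary placeholders, and the -1 check is left out. The value is split into mantissa and exponent with `std::frexp` and reduced modulo the prime 2^61 - 1, so the result depends on the number itself and not on its bit pattern.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical sketch: hash a double by its numeric value modulo 2^61 - 1.
inline std::uint64_t hash_double_by_value(double v) {
    constexpr int kBits = 61;
    constexpr std::uint64_t kModulus = (std::uint64_t{1} << kBits) - 1;

    // Non-finite values get fixed constants (arbitrary choices for this sketch).
    if (std::isnan(v)) return 0;
    if (std::isinf(v)) {
        return v > 0 ? std::uint64_t{314159} : 0 - std::uint64_t{314159};
    }

    int e = 0;
    double m = std::frexp(v, &e);  // v == m * 2^e with 0.5 <= |m| < 1, or m == 0
    const bool negative = m < 0;
    if (negative) m = -m;

    std::uint64_t x = 0;
    while (m != 0.0) {
        // Rotating left by 28 within 61 bits multiplies by 2^28 mod (2^61 - 1).
        x = ((x << 28) & kModulus) | (x >> (kBits - 28));
        m *= 268435456.0;  // 2^28: move the next 28 mantissa bits above the point
        e -= 28;
        const auto y = static_cast<std::uint64_t>(m);  // integer part
        m -= static_cast<double>(y);
        x += y;
        if (x >= kModulus) x -= kModulus;
    }

    // Apply the remaining factor 2^e as a rotation by (e mod 61).
    const int r = e >= 0 ? e % kBits : kBits - 1 - ((-1 - e) % kBits);
    x = ((x << r) & kModulus) | (x >> (kBits - r));

    return negative ? 0 - x : x;  // negative inputs wrap around in uint64_t
}
```

With this approach 0.0 and -0.0 hash to the same value without an explicit check, and whole-number doubles such as 1.0, 2.0 and 3.0 hash to 1, 2 and 3, which is useful if floats and integers that compare equal must share a hash. The result still needs to be mixed or reduced before being used as a bucket index.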