What is the best 32-bit hash function for short strings (tag names)?

傲寒 2020-12-12 16:43

What is the best 32-bit hash function for relatively short strings?

Strings are tag names that consist of English letters, numbers, spaces and some additional charact

8 answers
  •  星月不相逢
    2020-12-12 16:58

    That depends on your hardware. On modern hardware, i.e. Intel/AMD with SSE4.2 or ARMv8 with the CRC32 extension, you should use the hardware CRC32 intrinsics (_mm_crc32_uxx on x86), as they are optimal for short strings. (They are good for long keys as well, but there you are better off with Adler's threaded version, as in zlib.)
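As a rough illustration of the intrinsics mentioned above, here is a minimal sketch that hashes a tag name with `_mm_crc32_u32` / `_mm_crc32_u8` when SSE4.2 is present and otherwise falls back to a bitwise software CRC-32C. It assumes GCC or Clang (for `__builtin_cpu_supports` and the `target` attribute); the names `crc32c_hw`, `crc32c_sw`, `tag_hash_crc32c` and the macro `HAVE_X86_CRC32` are made up for this example and are not taken from smhasher.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical feature macro: x86 target built with GCC or Clang. */
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#define HAVE_X86_CRC32 1
#include <nmmintrin.h> /* SSE4.2 intrinsics: _mm_crc32_u8, _mm_crc32_u32 */

/* Hardware CRC-32C; compiled for SSE4.2 regardless of global flags. */
__attribute__((target("sse4.2")))
uint32_t crc32c_hw(const char *s, size_t len, uint32_t seed)
{
    uint32_t h = seed;
    while (len >= 4) {              /* consume 4 bytes per instruction */
        uint32_t w;
        memcpy(&w, s, 4);           /* avoids unaligned-access UB */
        h = _mm_crc32_u32(h, w);
        s += 4;
        len -= 4;
    }
    while (len--)                   /* remaining 0..3 bytes */
        h = _mm_crc32_u8(h, (unsigned char)*s++);
    return h;
}
#endif

/* Portable bitwise CRC-32C (reflected polynomial 0x82F63B78).
   Slow, but produces the same values as the hardware path. */
uint32_t crc32c_sw(const char *s, size_t len, uint32_t seed)
{
    uint32_t h = seed;
    assert(s != NULL || len == 0);
    while (len--) {
        h ^= (unsigned char)*s++;
        for (int bit = 0; bit < 8; bit++)
            h = (h >> 1) ^ (0x82F63B78u & (0u - (h & 1u)));
    }
    return h;
}

/* Pick the hardware path at run time when the CPU supports it. */
uint32_t tag_hash_crc32c(const char *s, size_t len, uint32_t seed)
{
#ifdef HAVE_X86_CRC32
    if (__builtin_cpu_supports("sse4.2"))
        return crc32c_hw(s, len, seed);
#endif
    return crc32c_sw(s, len, seed);
}
```

A caller would use something like `tag_hash_crc32c(name, strlen(name), 0) % table_size`. Note that the raw instruction applies no initial or final inversion, so the seed is simply the starting CRC value. In real code the software fallback would normally be table-driven, or replaced by a different hash altogether.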

    On old or unknown hardware, either probe at run time for the SSE4.2 or CRC32 feature, or just use one of the simple, good hash functions, e.g. Murmur2 or City.
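For the software route, the following is a sketch of the 32-bit MurmurHash2 algorithm, written from the published algorithm rather than copied from the linked repository. The function name `murmur2_32` and the `size_t` length parameter are choices made for this example. Like the reference version, it reads input in native byte order, so the values differ between little-endian and big-endian machines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit MurmurHash2 sketch: mixes 4 bytes at a time, then the tail. */
uint32_t murmur2_32(const void *key, size_t len, uint32_t seed)
{
    const uint32_t m = 0x5bd1e995u;  /* multiplication constant */
    const int r = 24;                /* shift constant */
    const unsigned char *data = (const unsigned char *)key;
    uint32_t h = seed ^ (uint32_t)len;

    assert(key != NULL || len == 0);

    while (len >= 4) {
        uint32_t k;
        memcpy(&k, data, 4);         /* safe for unaligned input */
        k *= m;
        k ^= k >> r;
        k *= m;
        h *= m;
        h ^= k;
        data += 4;
        len -= 4;
    }

    switch (len) {                   /* mix in the last 1..3 bytes */
    case 3: h ^= (uint32_t)data[2] << 16; /* fall through */
    case 2: h ^= (uint32_t)data[1] << 8;  /* fall through */
    case 1: h ^= data[0];
            h *= m;
    }

    /* Final avalanche so the last bytes affect all output bits. */
    h ^= h >> 13;
    h *= m;
    h ^= h >> 15;
    return h;
}
```

For tag names this would be called as `murmur2_32(name, strlen(name), seed)`, with any fixed seed; the result is then reduced to a bucket index by the hash table.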

    An overview of quality and performance is here: https://github.com/rurban/smhasher#smhasher

    All the implementations are in that repository as well. The favored ones are https://github.com/rurban/smhasher/blob/master/crc32_hw.c and https://github.com/rurban/smhasher/blob/master/MurmurHash2.cpp

    If you know the keys in advance, use a perfect hash, not a hash function. E.g. gperf or my phash: https://github.com/rurban/Perfect-Hash#name

    Nowadays perfect hash generation via a C compiler is so fast that you can even create perfect hashes on the fly and load them dynamically.
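To make the perfect-hash idea concrete, here is a toy sketch that brute-forces a seed under which a fixed set of tag names maps to distinct slots. This is not how gperf or Perfect-Hash work internally; it only illustrates the property they provide. The names `seeded_hash` and `find_perfect_seed`, the seeded FNV-1a core with a Murmur3-style finalizer, and the 64-slot limit are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Seeded FNV-1a followed by a Murmur3-style finalizer, so that the
   low bits used for the slot index depend on every input bit. */
uint32_t seeded_hash(const char *s, uint32_t seed)
{
    uint32_t h = 2166136261u ^ seed;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Try seeds 0..max_tries-1 until all n keys land in distinct slots of
   a table with nslots entries (nslots <= 64 in this sketch).
   Returns 1 and stores the seed on success, 0 if no seed was found. */
int find_perfect_seed(const char *const *keys, size_t n,
                      uint32_t nslots, uint32_t max_tries,
                      uint32_t *seed_out)
{
    assert(nslots > 0 && nslots <= 64 && n <= nslots);
    for (uint32_t seed = 0; seed < max_tries; seed++) {
        uint64_t used = 0;           /* bitmap of occupied slots */
        size_t i;
        for (i = 0; i < n; i++) {
            uint64_t bit = (uint64_t)1 << (seeded_hash(keys[i], seed) % nslots);
            if (used & bit)
                break;               /* collision: try the next seed */
            used |= bit;
        }
        if (i == n) {
            *seed_out = seed;
            return 1;
        }
    }
    return 0;
}
```

Once a seed is found, a lookup is `seeded_hash(name, seed) % nslots` followed by a single string comparison against the key stored in that slot, with no collision handling. Dedicated generators produce far more compact and faster functions than this brute-force search.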
