What is the best 32-bit hash function for relatively short strings?
Strings are tag names that consist of English letters, numbers, spaces and some additional charact
That depends on your hardware.
On modern hardware, i.e. Intel/AMD with SSE4.2 or ARMv8 with the CRC32 extension, you should use the hardware CRC32 instructions (the _mm_crc32_uxx intrinsics on x86),
as they are optimal for short strings. (For long keys too, but then better use Adler's threaded version, as in zlib.)
On old or unknown hardware, either probe at run time for the SSE4.2 or CRC32 feature, or just use one of the simple, good hash functions, e.g. Murmur2 or City.
An overview of quality and performance is here: https://github.com/rurban/smhasher#smhasher
All the implementations are there as well. The favored ones are https://github.com/rurban/smhasher/blob/master/crc32_hw.c and https://github.com/rurban/smhasher/blob/master/MurmurHash2.cpp
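The run-time probe mentioned above could look roughly like the sketch below. It is only an illustration, not the code from the linked crc32_hw.c: the names tag_hash, crc32c_sw and crc32c_hw are made up here, GCC or Clang is assumed for __builtin_cpu_supports and the target attribute, and the loop goes byte by byte for clarity. A tuned version would typically feed 4 or 8 bytes per instruction through _mm_crc32_u32 or _mm_crc32_u64.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable fallback: bitwise CRC-32C (Castagnoli polynomial, reflected 0x82F63B78). */
static uint32_t crc32c_sw(const unsigned char *s, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= s[i];
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <nmmintrin.h>

/* SSE4.2 path: the crc32 instruction implements the same CRC-32C polynomial.
   The target attribute lets this compile without a global -msse4.2 flag. */
__attribute__((target("sse4.2")))
static uint32_t crc32c_hw(const unsigned char *s, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++)
        crc = _mm_crc32_u8(crc, s[i]);
    return ~crc;
}

/* Hypothetical entry point: probe the CPU at run time, then dispatch. */
uint32_t tag_hash(const char *s, size_t len)
{
    if (__builtin_cpu_supports("sse4.2"))
        return crc32c_hw((const unsigned char *)s, len);
    return crc32c_sw((const unsigned char *)s, len);
}
#else
/* Unknown compiler or non-x86 target: always use the software loop. */
uint32_t tag_hash(const char *s, size_t len)
{
    return crc32c_sw((const unsigned char *)s, len);
}
#endif
```

Both paths return the same value for the same input, so a table built on one machine stays valid on another. In a hot loop you would probe once and cache a function pointer instead of checking on every call.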
If you know the keys in advance, use a perfect hash, not a hash function. E.g. gperf or my phash: https://github.com/rurban/Perfect-Hash#name
Nowadays perfect hash generation via a C compiler is so fast that you can even create one on the fly and load it dynamically.
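To make the idea concrete: a perfect hash maps a known key set to slots without any collision. The sketch below is not how gperf or phash work internally. It is a naive brute-force search over the multiplier of a simple multiplicative hash, and all names in it (mul_hash, is_perfect, find_multiplier, MAX_SLOTS) are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SLOTS 64  /* arbitrary upper bound for this sketch */

/* Simple multiplicative string hash with a tunable multiplier. */
static uint32_t mul_hash(const char *s, uint32_t multiplier)
{
    uint32_t h = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; p++)
        h = multiplier * h + *p;
    return h;
}

/* Return 1 if every key lands in its own slot, 0 on any collision. */
static int is_perfect(const char *const *keys, size_t nkeys,
                      uint32_t multiplier, uint32_t nslots)
{
    unsigned char used[MAX_SLOTS] = {0};
    if (nslots == 0 || nslots > MAX_SLOTS)
        return 0;
    for (size_t i = 0; i < nkeys; i++) {
        uint32_t slot = mul_hash(keys[i], multiplier) % nslots;
        if (used[slot])
            return 0;
        used[slot] = 1;
    }
    return 1;
}

/* Try multipliers 1..limit; return the first collision-free one, or 0. */
static uint32_t find_multiplier(const char *const *keys, size_t nkeys,
                                uint32_t nslots, uint32_t limit)
{
    for (uint32_t m = 1; m <= limit; m++)
        if (is_perfect(keys, nkeys, m, nslots))
            return m;
    return 0;
}
```

Once a multiplier is found, lookup is one hash plus one string comparison against the key stored in that slot; the comparison is still needed to reject strings that are not in the set. Real generators use far better search strategies and produce much tighter tables.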
I'm not sure if it's the best choice, but here is a hash function for strings, from The Practice of Programming (HASH TABLES, pg. 57):
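enum { MULTIPLIER = 31 }; /* 31 and 37 both work well for ASCII strings */

/* hash: compute hash value of string */
unsigned int hash(const char *str)
{
    unsigned int h;
    const unsigned char *p;

    h = 0;
    /* read bytes as unsigned so characters above 127 do not go negative */
    for (p = (const unsigned char *)str; *p != '\0'; p++)
        h = MULTIPLIER * h + *p;
    return h; /* or h % ARRAY_SIZE to get a bucket index */
}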
Empirically, the values 31 and 37 have proven to be good choices for the multiplier in a hash function for ASCII strings.
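One thing any multiplier greater than 1 buys you is sensitivity to character order. A small sketch to play with, where hash_with, bucket_of and NBUCKETS are hypothetical names and the table size is picked arbitrarily:

```c
#include <stddef.h>

enum { NBUCKETS = 1024 }; /* hypothetical table size */

/* Same loop as the function above, with the multiplier passed in. */
static unsigned int hash_with(const char *str, unsigned int multiplier)
{
    unsigned int h = 0;
    for (const unsigned char *p = (const unsigned char *)str; *p != '\0'; p++)
        h = multiplier * h + *p;
    return h;
}

/* Reduce the hash to a bucket index. */
static size_t bucket_of(const char *tag, unsigned int multiplier)
{
    return hash_with(tag, multiplier) % NBUCKETS;
}
```

With a multiplier of 1 the hash is just the sum of the bytes, so "ab" and "ba" both give 195. With 31 they give 97*31 + 98 = 3105 and 98*31 + 97 = 3135, so the two tags land in different buckets.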