I'm trying to choose a hash algorithm for comparing at most about 20 different pieces of text data.
Which hash is better for these requirements?
I had the same requirement myself and I implemented xxHashSharp. Just make sure you take the appropriate library (x32 vs. x64). It's also available outside of C# here
The FNV hash is a well-known fast hashing algorithm. It is not cryptographically secure, but it sounds like you don't need a secure hash.
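To give an idea of how little code FNV needs, here is a minimal sketch of the 32-bit FNV-1a variant. It is written in Python purely for illustration, and the function name `fnv1a_32` is my own; the offset basis and prime are the published 32-bit FNV constants.

```python
FNV_OFFSET_BASIS_32 = 0x811C9DC5  # published 32-bit FNV offset basis
FNV_PRIME_32 = 0x01000193         # published 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """Return the 32-bit FNV-1a hash of data (illustrative helper name)."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte                             # FNV-1a: XOR the byte in first
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF   # then multiply, keeping 32 bits
    return h


if __name__ == "__main__":
    # Hash each text once, then compare the 4-byte values instead of the texts.
    texts = ["first document", "second document", "first document"]
    hashes = [fnv1a_32(t.encode("utf-8")) for t in texts]
    print(hashes[0] == hashes[2])  # identical texts always give identical hashes
```

With only about 20 texts, a 32-bit value is probably enough, but equal hashes still only suggest equal texts, so compare the actual strings when two hashes match.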
If collisions are not a big deal, you can take the first letter of each document. Or you can use the length of the text, or the text string itself.
Check out the series Peter Karkowski published on his blog.
Paul Hsieh has a decent, simple, fast, 32-bit SuperFastHash that performs better than most existing hash functions, is easier to understand/implement, and sounds like it meets your criteria.
If you are constrained to algorithms that exist in the framework, is MD5 small enough (16 bytes)?
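For a sense of what the 16 bytes look like in practice, here is a small sketch using Python's standard `hashlib` module as a stand-in for whatever built-in MD5 your framework offers. The helper name `md5_digest` is hypothetical, and UTF-8 encoding is an assumption about your text.

```python
import hashlib


def md5_digest(text: str) -> bytes:
    """Return the 16-byte MD5 digest of text, encoded as UTF-8 (assumed)."""
    return hashlib.md5(text.encode("utf-8")).digest()


if __name__ == "__main__":
    a = md5_digest("some text data")
    b = md5_digest("some other text data")
    print(len(a))   # digest size in bytes
    print(a == b)   # compare the digests rather than the full texts
```

MD5 is no longer considered safe for security purposes, but that does not matter if you only use it to tell a handful of texts apart.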
Low CPU consumption and a small footprint are usually mutually exclusive.
http://en.wikipedia.org/wiki/Time-space_tradeoff