What kind of hash algorithm is used for Hive's built-in HASH() function?
What kind of hashing algorithm is used in the built-in HASH() function? I'm ideally looking for a SHA512/SHA256 hash, similar to what the SHA() function offers within the LinkedIn DataFu UDFs for Pig.

The HASH function (as of Hive 0.11) is not a cryptographic hash. It uses an algorithm similar to java.util.List#hashCode and produces a 32-bit int. Its code looks like this:

    // Hive HASH uses 0 as the seed, List#hashCode uses 1. I don't know why.
    int hashCode = 0;
    for (Object item : items) {
        hashCode = hashCode * 31 + (item == null ? 0 : item.hashCode());
    }

Basically it's a classic hash algorithm as recommended in the book Effective Java. To quote a great man