I would like to write a function GetHashCodeOfList() which returns a hash-code of a list of strings regardless of order. Given 2 lists with the same strings sho
Here is a hybrid approach. It combines the three commutative operations (XOR, addition, and multiplication), applying each one to a different bit range of the 32-bit hash code. The width of each range is adjustable through the three constants.
// Requires: using System.Collections.Generic; using System.Diagnostics;
public static int GetOrderIndependentHashCode<T>(IEnumerable<T> source)
{
    var comparer = EqualityComparer<T>.Default;
    const int XOR_BITS = 10;
    const int ADD_BITS = 11;
    const int MUL_BITS = 11;
    Debug.Assert(XOR_BITS + ADD_BITS + MUL_BITS == 32);
    int xor_total = 0;
    int add_total = 0;
    int mul_total = 17;
    unchecked
    {
        foreach (T element in source)
        {
            var hashcode = comparer.GetHashCode(element);
            // Split the hash code into three bit ranges. >> on int is an
            // arithmetic shift, so each part is sign-extended.
            int xor_part = hashcode >> (32 - XOR_BITS);                    // top bits
            int add_part = hashcode << XOR_BITS >> (32 - ADD_BITS);        // middle bits
            int mul_part = hashcode << (32 - MUL_BITS) >> (32 - MUL_BITS); // low bits
            xor_total = xor_total ^ xor_part;
            add_total = add_total + add_part;
            // Skip zero so that one element cannot wipe out the whole product
            if (mul_part != 0) mul_total = mul_total * mul_part;
        }
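        // Keep only the low bits of each total. A mask is used instead of %,
        // because % keeps the sign of a negative total and would spill into the other ranges.
        xor_total = xor_total & ((1 << XOR_BITS) - 1); // Compact
        add_total = add_total & ((1 << ADD_BITS) - 1); // Compact
        mul_total = mul_total - 17; // Subtract initial value
        mul_total = mul_total & ((1 << MUL_BITS) - 1); // Compact
        // Layout: XOR in the top bits, addition in the middle, multiplication in the low bits
        int result = (xor_total << (32 - XOR_BITS)) + (add_total << MUL_BITS) + mul_total;
        return result;
    }
}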
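To make the bit ranges more concrete, here is a minimal sketch that applies the same three shifts from the loop above to a single hash code and then puts the pieces back together. The class and method names (`BitRangeDemo`, `Split`, `Recombine`) are made up for illustration and are not part of the function above.

```csharp
using System;

public static class BitRangeDemo
{
    const int XOR_BITS = 10;
    const int ADD_BITS = 11;
    const int MUL_BITS = 11;

    // Same shifts as in the loop above. Each part is one bit range of the
    // hash code, sign-extended because >> on int is an arithmetic shift.
    public static (int Xor, int Add, int Mul) Split(int hashcode)
    {
        int xorPart = hashcode >> (32 - XOR_BITS);                    // top 10 bits
        int addPart = hashcode << XOR_BITS >> (32 - ADD_BITS);        // middle 11 bits
        int mulPart = hashcode << (32 - MUL_BITS) >> (32 - MUL_BITS); // low 11 bits
        return (xorPart, addPart, mulPart);
    }

    // Masks each part to its width and moves it back to its original position.
    public static int Recombine(int xorPart, int addPart, int mulPart)
    {
        int x = xorPart & ((1 << XOR_BITS) - 1);
        int a = addPart & ((1 << ADD_BITS) - 1);
        int m = mulPart & ((1 << MUL_BITS) - 1);
        return (x << (32 - XOR_BITS)) | (a << MUL_BITS) | m;
    }

    public static void Main()
    {
        int hashcode = 0x12345678;
        var (x, a, m) = Split(hashcode);
        Console.WriteLine($"xor={x}, add={a}, mul={m}");   // xor=72, add=-374, mul=-392
        Console.WriteLine(Recombine(x, a, m) == hashcode); // True
    }
}
```

The round trip succeeds because the three ranges together cover all 32 bits exactly once, which is what the `Debug.Assert` on the constants checks.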
The performance is almost identical to that of the simple XOR method, because the call to GetHashCode on each element dominates the CPU cost.
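For comparison, the "simple XOR method" is presumably something along these lines. This is an assumption, since that code is not shown here, and `SimpleXorDemo` and `GetXorHashCode` are hypothetical names. The sketch also shows why XOR alone can be weak: equal elements cancel each other out in pairs.

```csharp
using System;
using System.Collections.Generic;

public static class SimpleXorDemo
{
    // Hypothetical baseline: XOR of all element hash codes.
    public static int GetXorHashCode<T>(IEnumerable<T> source)
    {
        var comparer = EqualityComparer<T>.Default;
        int total = 0;
        foreach (T element in source)
        {
            total ^= comparer.GetHashCode(element);
        }
        return total;
    }

    public static void Main()
    {
        var a = new List<string> { "apple", "banana", "cherry" };
        var b = new List<string> { "cherry", "apple", "banana" };
        // XOR is commutative, so the order does not matter
        Console.WriteLine(GetXorHashCode(a) == GetXorHashCode(b)); // True

        // x ^ x == 0, so a duplicated element hashes like an empty list
        var pair = new List<string> { "apple", "apple" };
        Console.WriteLine(GetXorHashCode(pair)); // 0
    }
}
```

Both versions make one pass over the list and call GetHashCode once per element; they differ only in a few integer operations per element.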