GetHashCode() problem using xor

≡放荡痞女 提交于 2019-12-21 04:11:24

问题


My understanding is that you're typically supposed to use xor with GetHashCode() to produce an int to identify your data by its value (as opposed to by its reference). Here's a simple example:

class Foo
{
    int m_a;
    int m_b;

    public int A
    {
        get { return m_a; }
        set { m_a = value; }
    }

    public int B
    {
        get { return m_b; }
        set { m_b = value; }
    }

    public Foo(int a, int b)
    {
        m_a = a;
        m_b = b;
    }

    public override int GetHashCode()
    {
        return A ^ B;
    }

    public override bool Equals(object obj)
    {
        return this.GetHashCode() == obj.GetHashCode();
    }
}

The idea being, I want to compare one instance of Foo to another based on the value of properties A and B. If Foo1.A == Foo2.A and Foo1.B == Foo2.B, then we have equality.

Here's the problem:

Foo one = new Foo(1, 2);
Foo two = new Foo(2, 1);

if (one.Equals(two)) { ... }  // This is true!

These both produce a value of 3 for GetHashCode(), causing Equals() to return true. Obviously, this is a trivial example, and with only two properties I could simply compare the individual properties in the Equals() method. However, with a more complex class this would get out of hand quickly.

I know that sometimes it makes good sense to set the hash code only once, and always return the same value. However, for mutable objects where an evaluation of equality is necessary, I don't think this is reasonable.

What's the best way to handle property values that could easily be interchanged when implementing GetHashCode()?

See Also

What is the best algorithm for an overridden System.Object.GetHashCode?


回答1:


First off - Do not implement Equals() only in terms of GetHashCode() - hashcodes will sometimes collide even when objects are not equal.

The contract for GetHashCode() includes the following:

  • different hashcodes means that objects are definitely not equal
  • same hashcodes means objects might be equal (but possibly might not)

Andrew Hare suggested I incorporate his answer:

I would recommend that you read this solution (by our very own Jon Skeet, by the way) for a "better" way to calculate a hashcode.

No, the above is relatively slow and doesn't help a lot. Some people use XOR (eg a ^ b ^ c) but I prefer the kind of method shown in Josh Bloch's "Effective Java":

public override int GetHashCode()
{
    int hash = 23;
    hash = hash*37 + craneCounterweightID;
    hash = hash*37 + trailerID;
    hash = hash*37 + craneConfigurationTypeCode.GetHashCode();
    return hash;
}

The 23 and 37 are arbitrary numbers which are co-prime.

The benefit of the above over the XOR method is that if you have a type which has two values which are frequently the same, XORing those values will always give the same result (0) whereas the above will differentiate between them unless you're very unlucky.

As mentioned in the above snippet, you might also want to look at Joshua Bloch's book, Effective Java, which contains a nice treatment of the subject (the hashcode discussion applies to .NET as well).




回答2:


Andrew has posted a good example for generating a better hash code, but also bear in mind that you shouldn't use hash codes as an equality check, since they are not guaranteed to be unique.

For a trivial example of why this is consider a double object. It has more possible values than an int so it is impossible to have a unique int for each double. Hashes are really just a first pass, used in situations like a dictionary when you need to find the key quickly, by first comparing hashes a large percentage of the possible keys can be ruled out and only the keys with matching hashes need to have the expense of a full equality check (or other collision resolution methods).




回答3:


Hashing always involves collisions and you have to deal with it (f.e., compare hash values and if they are equal, exactly compare the values inside the classes to be sure the classes are equal).

Using a simple XOR, you'll get many collisions. If you want less, use some mathematical functions that distribute values across different bits (bit shifts, multiplying with primes etc.).




回答4:


Read Overriding GetHashCode for mutable objects? C# and think about implementing IEquatable<T>




回答5:


A quick generate and good distribution of hash

public override int GetHashCode()
{
    return A.GetHashCode() ^ B.GetHashCode();         // XOR
}



回答6:


Out of curiosity since hashcodes are typically a bad idea for comparison, wouldn't it be better to just do the following code, or am I missing something?

public override bool Equals(object obj)
{
    bool isEqual = false;
    Foo otherFoo = obj as Foo;
    if (otherFoo != null)
    {
        isEqual = (this.A == otherFoo.A) && (this.B == otherFoo.B);
    }
    return isEqual;
}



回答7:


There are several better hash implementations. FNV hash for example.



来源:https://stackoverflow.com/questions/1008633/gethashcode-problem-using-xor

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!