Using HashSet and Contains to return TRUE if one or many fields is in the hash

杀马特。学长 韩版系。学妹 submitted on 2021-01-20 09:44:55

Question


I am wondering whether it is possible to use a HashSet and have the Contains method return true if any one of the fields of a given object is already in the set.

This is an example of what I would like:

static void Main(string[] args)
{
    HashSet<Product> hash = new HashSet<Product>();

    // Since the Id is the same, both products are considered to be the same even if the URI is not the same
    // The opposite is also true.  If the URI is the same, both products are considered to be the same even if the Id is not the same
    Product product1 = new Product("123", "www.test.com/123.html");
    Product product2 = new Product("123", "www.test.com/123.html?lang=en");

    hash.Add(product1);

    if (hash.Contains(product2))
    {
        // I want the method "Contains" to return TRUE because one of the fields is in the hash
    }
}

Here is the definition of the class Product:

public class Product
{
    public string WebId;
    public string Uri;

    public Product(string Id, string uri)
    {
        WebId = Id;
        Uri = uri;
    }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(null, obj)) return false;
        if (ReferenceEquals(this, obj)) return true;
        if (obj.GetType() != typeof(Product)) return false;
        return Equals((Product)obj);
    }

    public bool Equals(Product obj)
    {
        if (ReferenceEquals(null, obj)) return false;
        if (ReferenceEquals(this, obj)) return true;

        // Equal when EITHER field matches
        return String.Equals(WebId, obj.WebId) || String.Equals(Uri, obj.Uri);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;

            hash = hash * 23 + WebId.GetHashCode();
            hash = hash * 23 + Uri.GetHashCode();
            return hash;
        }
    }
}

When I run my program, Contains only calls GetHashCode and never calls Equals, so it returns false.

How can I make my HashSet return true for the example above? Should I use a Dictionary instead and add each field to it?


Answer 1:


Your GetHashCode() implementation isn't guaranteed to return the same value for two objects that are equal. You only require a match on, say, WebId, but the Uri also feeds into the hash code and changes it (or the other way around). HashSet<> compares hash codes first and only calls Equals() when they match, which is why Equals() never runs. You cannot fix this other than by returning a constant such as 0. That will kill the HashSet<> performance: lookups become O(n) instead of O(1).
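
As a rough sketch of that trade-off, the comparer below returns a constant hash code, so every product lands in the same bucket and Equals is always consulted. The name EitherFieldComparer is hypothetical, and Product here is a minimal stand-in for the class in the question, reduced to its two string fields. Keep in mind that "either field matches" is not transitive (A can match B and B can match C while A does not match C), so which items the set keeps can depend on insertion order.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for the question's Product (assumption: two public string fields).
public class Product
{
    public string WebId;
    public string Uri;

    public Product(string id, string uri)
    {
        WebId = id;
        Uri = uri;
    }
}

// Hypothetical comparer: two products are equal when EITHER field matches.
public class EitherFieldComparer : IEqualityComparer<Product>
{
    public bool Equals(Product x, Product y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return string.Equals(x.WebId, y.WebId) || string.Equals(x.Uri, y.Uri);
    }

    // Constant hash code: all items share one bucket, so each lookup
    // falls through to Equals and scans the bucket linearly.
    public int GetHashCode(Product obj)
    {
        return 0;
    }
}
```

With this comparer passed to the constructor, new HashSet<Product>(new EitherFieldComparer()), the Contains call from the question should return true for product2, at the cost of a linear scan on every lookup.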




Answer 2:


In a recent project we had the same problem: the class's Equals() implementation ORed several properties together to determine equality. To get a fast Contains(), we built several IEqualityComparer<T> implementations, each one checking ONE property. You need one for each property that is ORed in your equality check.

    // Compares products by WebId only
    class WebIdComparer : IEqualityComparer<Product>
    {
        public bool Equals(Product x, Product y)
        {
            return string.Equals(x.WebId, y.WebId);
        }

        public int GetHashCode(Product obj)
        {
            return obj.WebId.GetHashCode();
        }
    }

    // Compares products by Uri only
    class UriComparer : IEqualityComparer<Product>
    {
        public bool Equals(Product x, Product y)
        {
            return string.Equals(x.Uri, y.Uri);
        }

        public int GetHashCode(Product obj)
        {
            return obj.Uri.GetHashCode();
        }
    }

Then create one HashSet per comparer, passing the comparer to the constructor. Insert your collection into each set; then, for each item you want to test, call Contains() on each set and OR the results. For example:

var uriHashTable = new HashSet<Product>(existingProducts, new UriComparer());
var webIdHashTable = new HashSet<Product>(existingProducts, new WebIdComparer());

foreach (var newProduct in newProducts)
{
    if (uriHashTable.Contains(newProduct) || webIdHashTable.Contains(newProduct))
    {
        // newProduct is equal to an existing product according to your Equals implementation
    }
}

The obvious drawback is memory: this approach uses quite a bit more than a linear IEnumerable.Contains() scan, and it needs an additional set for every property that is ORed in your Equals implementation.
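
One possible variation on the same idea, sketched under assumptions: rather than keeping one HashSet<Product> per comparer, keep one HashSet<string> per field and wrap them in a small helper. ProductIndex is a hypothetical name, and Product is again a minimal stand-in for the question's class. It still needs one set per ORed property, so it is not a fix for the memory cost, only a different packaging that avoids writing a comparer per property.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for the question's Product (assumption: two public string fields).
public class Product
{
    public string WebId;
    public string Uri;

    public Product(string id, string uri)
    {
        WebId = id;
        Uri = uri;
    }
}

// Hypothetical helper: one string set per field that takes part in the OR.
public class ProductIndex
{
    private readonly HashSet<string> webIds = new HashSet<string>();
    private readonly HashSet<string> uris = new HashSet<string>();

    // Records both fields of the product.
    public void Add(Product p)
    {
        webIds.Add(p.WebId);
        uris.Add(p.Uri);
    }

    // True when either field of p has been seen before;
    // each check is a single hash lookup.
    public bool Contains(Product p)
    {
        return webIds.Contains(p.WebId) || uris.Contains(p.Uri);
    }
}
```

After adding product1 from the question to a ProductIndex, calling Contains with product2 should return true because the WebId "123" is already recorded.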




Answer 3:


Does it fit your program design to use a lambda for the lookup? It is the most straightforward way I can think of to achieve what you want. Note that HashSet<T>.Contains only accepts an item, not a predicate, so the lambda has to go through LINQ's Any (which needs using System.Linq and scans the set linearly):

if (hash.Any(p => p.WebId == product2.WebId))
{
    // "Any" returns TRUE because the WebId matches
}


Source: https://stackoverflow.com/questions/5176116/using-hashset-and-contains-to-return-true-if-one-or-many-fields-is-in-the-hash
