Removing duplicates from a List in C#

予麋鹿 2020-12-15 13:09

I am following a previous post on Stack Overflow about removing duplicates from a List<T> in C#.

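If T is some user-defined type, such as the Contact class used in the answers below (with fields like firstname and lastname), how can the duplicates be removed?
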
3 Answers
  • 2020-12-15 13:20

    For this task I don't think implementing IComparable is necessarily the obvious solution. You might want to sort and test for uniqueness in many different ways.

    I would favor implementing an IEqualityComparer<Contact>:

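    using System.Collections.Generic;

    sealed class ContactFirstNameLastNameComparer : IEqualityComparer<Contact>
    {
      // Two contacts are equal when both names match; null equals only null.
      public bool Equals (Contact x, Contact y)
      {
         if (ReferenceEquals (x, y)) return true;
         if (x == null || y == null) return false;
         return x.firstname == y.firstname && x.lastname == y.lastname;
      }

      // Must agree with Equals: hash only the fields that Equals compares.
      public int GetHashCode (Contact obj)
      {
         var names = EqualityComparer<string>.Default; // tolerates null strings
         return names.GetHashCode (obj.firstname) ^ names.GetHashCode (obj.lastname);
      }
    }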
    

    Then use System.Linq.Enumerable.Distinct (assuming you are using at least .NET 3.5):

    var unique = contacts.Distinct (new ContactFirstNameLastNameComparer ()).ToArray ();
    

    PS: Speaking of HashSet<>, note that HashSet<> takes an IEqualityComparer<> as a constructor parameter.
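
    To illustrate the PS, here is a minimal sketch of passing the comparer to the HashSet<> constructor. The Contact class below is a hypothetical stand-in with just the two fields the comparer reads, and HashSetDemo and Dedupe are names made up for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal Contact with the two fields the comparer reads.
class Contact
{
    public string firstname;
    public string lastname;
}

sealed class ContactFirstNameLastNameComparer : IEqualityComparer<Contact>
{
    public bool Equals(Contact x, Contact y)
    {
        return x.firstname == y.firstname && x.lastname == y.lastname;
    }

    public int GetHashCode(Contact obj)
    {
        return obj.firstname.GetHashCode() ^ obj.lastname.GetHashCode();
    }
}

static class HashSetDemo
{
    // Builds a set that uses the comparer instead of reference equality.
    public static HashSet<Contact> Dedupe(IEnumerable<Contact> contacts)
    {
        return new HashSet<Contact>(contacts, new ContactFirstNameLastNameComparer());
    }

    static void Main()
    {
        var contacts = new List<Contact>
        {
            new Contact { firstname = "John", lastname = "Doe" },
            new Contact { firstname = "John", lastname = "Doe" },
            new Contact { firstname = "Jane", lastname = "Doe" }
        };
        Console.WriteLine(Dedupe(contacts).Count); // 2
    }
}
```

    Keep in mind that a HashSet<> makes no promise about enumeration order, so this fits best when the order of the remaining items does not matter.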

  • 2020-12-15 13:22

    A HashSet<T> does remove duplicates, because it's a set... but only when your type defines equality appropriately.

    I suspect by "duplicate" you mean "an object with equal field values to another object" - you need to override Equals/GetHashCode for that to work, and/or implement IEquatable<Contact>... or you could provide an IEqualityComparer<Contact> to the HashSet<T> constructor.

    Instead of using a HashSet<T> you could just call the Distinct LINQ extension method. For example:

    list = list.Distinct().ToList();
    

    But again, you'll need to provide an appropriate definition of equality, somehow or other.
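
    As a quick illustration of why the definition of equality matters, this sketch runs Distinct() over a class that does not override Equals/GetHashCode. PlainContact and DefaultEqualityDemo are hypothetical names used only here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class that does NOT override Equals/GetHashCode.
class PlainContact
{
    public string Name;
}

static class DefaultEqualityDemo
{
    public static int CountDistinct()
    {
        var list = new List<PlainContact>
        {
            new PlainContact { Name = "John" },
            new PlainContact { Name = "John" }
        };
        // Distinct() falls back to reference equality, so both objects survive.
        return list.Distinct().Count();
    }

    public static int CountDistinctNames()
    {
        var names = new List<string> { "John", "John" };
        // string defines value equality, so the duplicate is dropped.
        return names.Distinct().Count();
    }

    static void Main()
    {
        Console.WriteLine(CountDistinct());      // 2
        Console.WriteLine(CountDistinctNames()); // 1
    }
}
```

    The two PlainContact objects carry the same data but are different references, which is all the default comparer looks at.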

    Here's a sample implementation. Note how I've made it immutable (equality is odd with mutable types, because two objects can be equal one minute and non-equal the next) and made the fields private, with public properties. Finally, I've sealed the class - immutable types should generally be sealed, and it makes equality easier to talk about.

    using System;
    using System.Collections.Generic; 
    
    public sealed class Contact : IEquatable<Contact>
    {
        private readonly string firstName;
        public string FirstName { get { return firstName; } }
    
        private readonly string lastName;
        public string LastName { get { return lastName; } }
    
        private readonly string phoneNumber;
        public string PhoneNumber { get { return phoneNumber; } }
    
        public Contact(string firstName, string lastName, string phoneNumber)
        {
            this.firstName = firstName;
            this.lastName = lastName;
            this.phoneNumber = phoneNumber;
        }
    
        public override bool Equals(object other)
        {
            return Equals(other as Contact);
        }
    
        public bool Equals(Contact other)
        {
            if (object.ReferenceEquals(other, null))
            {
                return false;
            }
            if (object.ReferenceEquals(other, this))
            {
                return true;
            }
            return FirstName == other.FirstName &&
                   LastName == other.LastName &&
                   PhoneNumber == other.PhoneNumber;
        }
    
        public override int GetHashCode()
        {
            // Note: *not* StringComparer; EqualityComparer<T>
            // copes with null; StringComparer doesn't.
            var comparer = EqualityComparer<string>.Default;
    
            // Unchecked to allow overflow, which is fine
            unchecked
            {
                int hash = 17;
                hash = hash * 31 + comparer.GetHashCode(FirstName);
                hash = hash * 31 + comparer.GetHashCode(LastName);
                hash = hash * 31 + comparer.GetHashCode(PhoneNumber);
                return hash;
            }
        }
    }
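
    To make the remark about mutable types concrete, here is a small sketch of what can go wrong when a field that feeds GetHashCode changes after the object has been added to a HashSet<T>. MutableKey is a hypothetical type (it uses an int field so the hash codes are predictable); it is not part of the Contact class above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable type whose hash code depends on a settable field.
sealed class MutableKey
{
    public int Id;

    public override bool Equals(object obj)
    {
        var other = obj as MutableKey;
        return other != null && other.Id == Id;
    }

    public override int GetHashCode()
    {
        return Id;
    }
}

static class MutableKeyDemo
{
    public static bool FoundAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var set = new HashSet<MutableKey> { key };
        key.Id = 2; // the set still files the object under the old hash code
        return set.Contains(key);
    }

    static void Main()
    {
        Console.WriteLine(FoundAfterMutation()); // False
    }
}
```

    The object is still in the set, but the lookup uses the new hash code and so no longer finds it. Immutable types cannot get into this state.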
    

    EDIT: Okay, in response to requests for an explanation of the GetHashCode() implementation:

    • We want to combine the hash codes of the properties of this object
    • We're not checking for nullity anywhere, so we should assume that some of them may be null. EqualityComparer<T>.Default always handles this, which is nice... so I'm using that to get a hash code of each field.
    • The "add and multiply" approach to combining several hash codes into one is the standard one recommended by Josh Bloch. There are plenty of other general-purpose hashing algorithms, but this one works fine for most applications.
    • I don't know whether you're compiling in a checked context by default, so I've put the computation in an unchecked context. We really don't care if the repeated multiply/add leads to an overflow, because we're not looking for a "magnitude" as such... just a number that we can reach repeatedly for equal objects.
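
    The last point can be seen in isolation with a short sketch: the same multiply-and-add step wraps around quietly in an unchecked context and throws in a checked one. OverflowDemo and its method names are made up for this illustration.

```csharp
using System;

static class OverflowDemo
{
    // One multiply-and-add step, with overflow allowed to wrap around.
    public static int CombineUnchecked(int hash, int fieldHash)
    {
        unchecked
        {
            return hash * 31 + fieldHash;
        }
    }

    // The same step in a checked context: overflow raises OverflowException.
    public static int CombineChecked(int hash, int fieldHash)
    {
        checked
        {
            return hash * 31 + fieldHash;
        }
    }

    static void Main()
    {
        // Wraps silently and still yields a usable hash value.
        Console.WriteLine(CombineUnchecked(int.MaxValue, 0));

        try
        {
            CombineChecked(int.MaxValue, 0);
        }
        catch (OverflowException)
        {
            Console.WriteLine("checked context threw OverflowException");
        }
    }
}
```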

    Two alternative ways of handling nullity, by the way:

    public override int GetHashCode()
    {
        // Unchecked to allow overflow, which is fine
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (FirstName ?? "").GetHashCode();
            hash = hash * 31 + (LastName ?? "").GetHashCode();
            hash = hash * 31 + (PhoneNumber ?? "").GetHashCode();
            return hash;
        }
    }
    

    or

    public override int GetHashCode()
    {
        // Unchecked to allow overflow, which is fine
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (FirstName == null ? 0 : FirstName.GetHashCode());
            hash = hash * 31 + (LastName == null ? 0 : LastName.GetHashCode());
            hash = hash * 31 + (PhoneNumber == null ? 0 : PhoneNumber.GetHashCode());
            return hash;
        }
    }
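
    A further option, not part of the answer above: on runtimes that ship System.HashCode (.NET Core 2.1 and later; older frameworks would need an extra package), HashCode.Combine does the combining and the null handling in one call. This is only a sketch, with a trimmed-down hypothetical Contact that shows just the equality members.

```csharp
using System;

// Hypothetical trimmed-down Contact; only the equality-related members are shown.
public sealed class Contact : IEquatable<Contact>
{
    public string FirstName { get; }
    public string LastName { get; }
    public string PhoneNumber { get; }

    public Contact(string firstName, string lastName, string phoneNumber)
    {
        FirstName = firstName;
        LastName = lastName;
        PhoneNumber = phoneNumber;
    }

    public bool Equals(Contact other)
    {
        return other != null &&
               FirstName == other.FirstName &&
               LastName == other.LastName &&
               PhoneNumber == other.PhoneNumber;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Contact);
    }

    // HashCode.Combine treats a null argument as hash code 0.
    public override int GetHashCode()
    {
        return HashCode.Combine(FirstName, LastName, PhoneNumber);
    }
}

static class HashCombineDemo
{
    static void Main()
    {
        var a = new Contact("John", null, "555-0100");
        var b = new Contact("John", null, "555-0100");
        Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // True
    }
}
```

    The values HashCode.Combine produces can differ from one process run to the next, so they should not be stored or compared across runs.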
    
  • 2020-12-15 13:36
    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Contact {
        public int Id { get; set; }
        public string Name { get; set; }
    
        public override string ToString()
        {
            return string.Format("{0}:{1}", Id, Name);
        }
    
        static private IEqualityComparer<Contact> comparer;
        static public IEqualityComparer<Contact> Comparer {
            get { return comparer ?? (comparer = new EqualityComparer()); }
        }
    
        class EqualityComparer : IEqualityComparer<Contact> {
            bool IEqualityComparer<Contact>.Equals(Contact x, Contact y)
            {
                if (x == y) 
                    return true;
    
                if (x == null || y == null)
                    return false;
    
                return x.Name == y.Name; // let's compare by Name
            }
    
            int IEqualityComparer<Contact>.GetHashCode(Contact c)
            {
                return c.Name == null ? 0 : c.Name.GetHashCode(); // hash by Name, to match Equals
            }
        }
    }
    
    class Program {
        public static void Main()
        {
            var list = new List<Contact> {
                new Contact { Id = 1, Name = "John" },
                new Contact { Id = 2, Name = "Sylvia" },
                new Contact { Id = 3, Name = "John" }
            };
    
            var distinctNames = list.Distinct(Contact.Comparer).ToList();
            foreach (var contact in distinctNames)
                Console.WriteLine(contact);
        }
    }
    

    gives

    1:John
    2:Sylvia
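
    For a one-off case where writing a comparer class feels heavy, the same result can be sketched with GroupBy. This is a different technique from the comparer above, shown only as an alternative; GroupByDemo and DistinctByName are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Contact
{
    public int Id { get; set; }
    public string Name { get; set; }
}

static class GroupByDemo
{
    // Keeps the first contact seen for each Name.
    public static List<Contact> DistinctByName(IEnumerable<Contact> contacts)
    {
        return contacts.GroupBy(c => c.Name)
                       .Select(g => g.First())
                       .ToList();
    }

    static void Main()
    {
        var list = new List<Contact>
        {
            new Contact { Id = 1, Name = "John" },
            new Contact { Id = 2, Name = "Sylvia" },
            new Contact { Id = 3, Name = "John" }
        };

        foreach (var c in DistinctByName(list))
            Console.WriteLine("{0}:{1}", c.Id, c.Name);
        // prints 1:John then 2:Sylvia
    }
}
```

    GroupBy yields groups in the order their keys first appear in the source, so the first contact with each name is the one that is kept.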
    