Most efficient way to remove duplicates from a List


深忆病人 2020-12-01 18:13

Let's say I have a List with duplicate values and I want to remove the duplicates.

List<int> myList = new List<int>(Enumerable.Range(0, 10000));


      
      
        
1 answer

执笔经年 (original poster) 2020-12-01 18:36
              

            
            
                        
There is a big difference between these two approaches:

List<int> Result1 = new HashSet<int>(myList).ToList(); // 3700 ticks
List<int> Result2 = myList.Distinct().ToList(); // 4700 ticks


The first one can (and probably will) change the order of the elements of the returned List<>: HashSet<> makes no guarantee about enumeration order, so the elements of Result1 may not be in the same order as those of myList. The second one preserves the original order.
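To make the ordering difference concrete, here is a minimal sketch. The class name DedupDemo and its two helper methods are hypothetical, introduced only for illustration. Because the enumeration order of a HashSet<> is unspecified, the example sorts that result before printing instead of assuming any particular order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DedupDemo
{
    // Hypothetical helper: dedupe through a HashSet<int>.
    // The order of the returned list is unspecified.
    public static List<int> ViaHashSet(List<int> source) =>
        new HashSet<int>(source).ToList();

    // Hypothetical helper: dedupe through LINQ Distinct().
    // With the implementation quoted in this answer, elements come out
    // in first-occurrence order.
    public static List<int> ViaDistinct(List<int> source) =>
        source.Distinct().ToList();

    public static void Main()
    {
        var input = new List<int> { 3, 1, 3, 2, 1 };

        // First occurrences, in their original order.
        Console.WriteLine(string.Join(", ", ViaDistinct(input))); // 3, 1, 2

        // Same elements; sorted here because the set's order is not guaranteed.
        var viaSet = ViaHashSet(input);
        viaSet.Sort();
        Console.WriteLine(string.Join(", ", viaSet)); // 1, 2, 3
    }
}
```

If the order of the result matters to the caller, the Distinct() version (or an explicit sort afterwards) is the safer choice.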

There is probably no faster way than the first one. 
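The tick counts above depend on the machine and runtime, so it may be worth measuring on your own setup. The following is only a rough sketch using Stopwatch; the class DedupTiming and the helper MeasureTicks are hypothetical names, and a single timed run like this is noisy, so treat the numbers as indicative at best.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class DedupTiming
{
    // Hypothetical helper: runs the action once and returns elapsed Stopwatch ticks.
    public static long MeasureTicks(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.ElapsedTicks;
    }

    public static void Main()
    {
        var myList = new List<int>(Enumerable.Range(0, 10000));

        // Warm-up runs so JIT compilation is not included in the timings.
        new HashSet<int>(myList).ToList();
        myList.Distinct().ToList();

        long hashSetTicks = MeasureTicks(() => new HashSet<int>(myList).ToList());
        long distinctTicks = MeasureTicks(() => myList.Distinct().ToList());

        Console.WriteLine($"HashSet: {hashSetTicks} ticks, Distinct: {distinctTicks} ticks");
    }
}
```

For more trustworthy numbers, repeat each measurement many times and compare averages, or use a dedicated benchmarking tool.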

There is probably no "more correct" way (for a certain definition of "correct" based on ordering) than the second one.

(the third one is similar to the second one, only slower)

Just out of curiosity, here is how Distinct() is implemented:

// Reference source http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,712
public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source) {
    if (source == null) throw Error.ArgumentNull("source");
    return DistinctIterator<TSource>(source, null);
}

// Reference source http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,722
static IEnumerable<TSource> DistinctIterator<TSource>(IEnumerable<TSource> source, IEqualityComparer<TSource> comparer) {
    Set<TSource> set = new Set<TSource>(comparer);
    foreach (TSource element in source)
        if (set.Add(element)) yield return element;
}


So in the end, Distinct() simply uses an internal implementation of a HashSet<> (called Set<>) to check for the uniqueness of items.
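The same idea can be sketched with the public HashSet<> type. This is not the framework's code: DistinctSketch and DistinctInOrder are hypothetical names, and the sketch only assumes that HashSet<>.Add returns false when the item is already present.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctSketch
{
    // Hypothetical helper, not part of LINQ: yields each element the first
    // time it is seen, so first-occurrence order is preserved.
    public static IEnumerable<T> DistinctInOrder<T>(
        IEnumerable<T> source, IEqualityComparer<T> comparer = null)
    {
        // Validate eagerly; the iterator below only runs when enumerated.
        if (source == null) throw new ArgumentNullException(nameof(source));
        return Iterate(source, comparer);
    }

    private static IEnumerable<T> Iterate<T>(
        IEnumerable<T> source, IEqualityComparer<T> comparer)
    {
        // A null comparer makes HashSet<T> use the default equality comparer.
        var seen = new HashSet<T>(comparer);
        foreach (T item in source)
            if (seen.Add(item)) yield return item; // Add is false for duplicates
    }
}
```

For example, calling DistinctSketch.DistinctInOrder(new[] { "b", "a", "B", "a" }, StringComparer.OrdinalIgnoreCase) yields "b" and then "a", because "B" compares equal to the earlier "b" under that comparer.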

For completeness' sake, I'll add a link to the question "Does C# Distinct() method keep original ordering of sequence intact?"
    
             
                                                        
            
            
              
                