More elegant way to check for duplicates in C++ array?

I wrote this code in C++ as part of a uni task where I need to ensure that there are no duplicates within an array:

// Check for duplicate numbers in user in


                      
12 answers

醉话见心, 2020-12-01 10:01

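A fast solution, O(N) expected time and O(N) space, that returns as soon as it hits the first duplicate:

#include <algorithm>
#include <unordered_set>
#include <vector>

template <typename T>
bool containsDuplicate(const std::vector<T>& items) {
    // The lambda owns the set of values seen so far; insert().second is false on a repeat.
    return std::any_of(items.begin(), items.end(), [s = std::unordered_set<T>{}](const auto& item) mutable {
        return !s.insert(item).second;
    });
}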

                                                                        
                                                        
            
            
              
                
傲寒, 2020-12-01 10:03

As mentioned by @underscore_d, an elegant and efficient solution would be:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

template <class Iterator>
bool has_duplicates(Iterator begin, Iterator end) {
    using T = typename std::iterator_traits<Iterator>::value_type;
    // Work on a copy so the caller's data keeps its original order.
    std::vector<T> values(begin, end);

    // After sorting, equal values sit next to each other.
    std::sort(values.begin(), values.end());
    return (std::adjacent_find(values.begin(), values.end()) != values.end());
}

int main() {
    int user_ids[6];
    // ...
    std::cout << has_duplicates(user_ids, user_ids + 6) << std::endl;
}

                                                                        
                                                        
            
            
              
                
再見小時候, 2020-12-01 10:05

Indeed, the fastest and, as far as I can see, most elegant method is as advised above. Note that it removes the duplicates and leaves the remaining values sorted:

std::vector<int> tUserNumbers;
// ...
std::set<int> tSet(tUserNumbers.begin(), tUserNumbers.end());
std::vector<int>(tSet.begin(), tSet.end()).swap(tUserNumbers);


It is O(n log n). However, this does not work if the original ordering of the numbers in the input array needs to be kept. In that case I did:

std::set<int> tTmp;
std::vector<int>::iterator tNewEnd =
    std::remove_if(tUserNumbers.begin(), tUserNumbers.end(),
        [&tTmp] (int pNumber) -> bool {
            // insert().second is false when pNumber was already seen.
            return (!tTmp.insert(pNumber).second);
        });
tUserNumbers.erase(tNewEnd, tUserNumbers.end());


which is still O(n log n) and keeps the original ordering of elements in tUserNumbers.
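
If the goal is only to detect duplicates, as in the original question, rather than to remove them, the same std::set idea can be reduced to a size comparison. This is a minimal sketch, assuming the values live in a std::vector<int>; the name hasDuplicates is made up for illustration and is not part of the answer above:

```cpp
#include <set>
#include <vector>

// Hypothetical helper (not from the answer above): returns true if any value
// occurs more than once. Building the set is O(n log n); the input is untouched.
bool hasDuplicates(const std::vector<int>& numbers) {
    const std::set<int> unique(numbers.begin(), numbers.end());
    // The set drops repeats, so a smaller size means at least one duplicate.
    return unique.size() != numbers.size();
}
```

For example, hasDuplicates({1, 2, 3}) is false and hasDuplicates({1, 2, 1}) is true. Unlike an insert-and-check loop, this always walks the whole input before answering.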

Cheers,

Paul
                                                                        
                                                        
            
            
              
                
梦谈多话, 2020-12-01 10:05

You can add all the elements to a set and check, on each insertion, whether the element is already present. That would be more elegant and efficient.
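
A minimal sketch of that idea, assuming a plain int array plus its length as in the question; the function name arrayHasDuplicate and its parameters are hypothetical:

```cpp
#include <cstddef>
#include <set>

// Hypothetical helper: inserts each element into a std::set and stops at the
// first element that was already present.
bool arrayHasDuplicate(const int* values, std::size_t count) {
    std::set<int> seen;
    for (std::size_t i = 0; i < count; ++i) {
        // insert() returns a pair; .second is false if the value was already in the set.
        if (!seen.insert(values[i]).second) {
            return true;
        }
    }
    return false;
}
```

With int ids[6] filled in by the user, the check would be arrayHasDuplicate(ids, 6). Each insertion into std::set is O(log n), so the whole check is O(n log n) in the worst case.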
                                                                        
                                                        
            
            
              
                
忘掉有多难, 2020-12-01 10:07

This extends the answer by @Puppy, which is currently the best answer.

PS: I tried to post this as a comment on @Puppy's answer but couldn't, as I don't have 50 points yet. Some experimental data is also shared here in case it helps.

Both std::set and std::map are typically implemented as balanced binary search trees, so both give O(n log n) complexity in this case. Better performance can be achieved with a hash table: std::unordered_map is hash-table based and offers constant-time lookup on average. I experimented with all three implementations and found the results with std::unordered_map to be better than with std::set and std::map. Results and code are shared below. The images are snapshots of the performance measured by LeetCode on the solutions.



#include <unordered_map>
#include <vector>

bool hasDuplicate(std::vector<int>& nums) {
    size_t count = nums.size();
    if (!count)
        return false;
    std::unordered_map<int, int> tbl;
    //std::set<int> tbl;
    for (size_t i = 0; i < count; i++) {
        // A value that is already in the table has been seen before.
        if (tbl.find(nums[i]) != tbl.end())
            return true;
        tbl[nums[i]] = 1;
        //tbl.insert(nums[i]);
    }
    return false;
}



[Image: unordered_map performance (run time was 52 ms here)]

[Image: set/map performance]
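
For anyone who wants to repeat the comparison locally rather than on LeetCode, a rough timing harness could look like the sketch below. It is an assumption-laden illustration, not the code used for the measurements above: hasDuplicateWith and timeCheckMicros are hypothetical names, and the set-like containers std::set and std::unordered_set stand in for the map-based table.

```cpp
#include <chrono>
#include <set>
#include <unordered_set>
#include <vector>

// Hypothetical generic checker: Container is any set-like type whose insert()
// returns a pair with a bool .second (std::set, std::unordered_set).
template <typename Container>
bool hasDuplicateWith(const std::vector<int>& nums) {
    Container seen;
    for (int n : nums) {
        if (!seen.insert(n).second) {
            return true; // n was already in the container
        }
    }
    return false;
}

// Times one call, stores its answer in `result`, and returns the elapsed
// time in microseconds.
template <typename Container>
long long timeCheckMicros(const std::vector<int>& nums, bool& result) {
    const auto start = std::chrono::steady_clock::now();
    result = hasDuplicateWith<Container>(nums);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling timeCheckMicros<std::set<int>>(nums, found) and timeCheckMicros<std::unordered_set<int>>(nums, found) on the same large input gives a rough comparison. The actual numbers depend on the compiler, the optimisation level, and the data, so no figures are claimed here.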

                                                                        
                                                        
            
            
              
                
粉色の甜心, 2020-12-01 10:07

I'm not sure why this hasn't been suggested, but here is a way to find duplicate digits in a base-10 number in O(n), where n is the number of digits. The problem I see with the already suggested O(n) solution is that it requires the digits to be sorted first. This method is O(n) and does not require the digits to be sorted. The cool thing is that, once the table is built, checking whether a specific digit has duplicates is O(1). I know this thread is probably dead but maybe it will help somebody! :)

/*
============================
Foo
============================
*
   Takes a read-only unsigned int. A table stores a counter for each
   decimal digit. If any digit's counter ends up higher than 1, the function
   returns false. For example, with 48778584:
    0   1   2   3   4   5   6   7   8   9
   [0] [0] [0] [0] [2] [1] [0] [2] [3] [0]

   When we iterate over this table, we find that 4 is duplicated and immediately
   return false.

*/
bool Foo(const unsigned int &number)
{
    unsigned int temp = number; // Unsigned copy, so values above INT_MAX are handled too.
    int digitTable[10] = {0};

    while (temp > 0)
    {
        digitTable[temp % 10]++; // Last digit's respective index.
        temp /= 10; // Move to next digit
    }

    for (int i = 0; i < 10; i++)
    {
        if (digitTable[i] > 1)
        {
            return false; // Digit i occurs more than once.
        }
    }
    return true; // Every digit occurs at most once.
}
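
The counting pass above always reads every digit before it scans the table. If only a yes/no answer is needed, the check can stop at the first repeat. The sketch below is a hypothetical variant, not the author's code: the name hasRepeatedDigit is made up, it swaps the counter table for a 10-bit mask, and its return value is the opposite of Foo's (true means a duplicate was found).

```cpp
// Hypothetical variant: returns true if some decimal digit of `number`
// occurs more than once, stopping at the first repeat.
bool hasRepeatedDigit(unsigned int number) {
    unsigned int seenMask = 0; // bit d is set once digit d has been seen
    while (number > 0) {
        const unsigned int bit = 1u << (number % 10);
        if (seenMask & bit) {
            return true; // this digit was already seen
        }
        seenMask |= bit;
        number /= 10; // move to the next digit
    }
    return false;
}
```

For 48778584 the digits are read from the right (4, 8, 5, 8), so the function returns true at the second 8 without reading the rest of the number.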

                                                                        
                                                        
            
            
              
                