Limit input data to achieve a better Big O complexity

傲寒 2020-12-22 14:40

You are given an unsorted array of n integers, and you would like to find if there are any duplicates in the array (i.e. any integer appearing more than once). D

3 answers
  • 2020-12-22 14:54

    A variant of bucket sort will do. It gives you O(n) time complexity, where n is the number of input elements.

    There is one restriction, though: you need to know the maximum value your integers can take (and the values must be non-negative). Call that maximum m.

    The idea is to create a boolean array of size m + 1 (all initialized to false), so that every value from 0 to m is a valid index. Then iterate over your array. For each element x, check bucket[x]: if it is already true, you have found a duplicate; otherwise, set bucket[x] to true.

    In Java:

    
    
    // Alternatively, you can scan the array first to find maxVal, which is also O(n).
    // Assumes every element lies in the range 0..maxVal.
    public boolean findDup(int[] arr, int maxVal)
    {
        // Java initializes every element of a boolean array to false.
        // The size is maxVal + 1 so that index maxVal itself is valid.
        boolean[] bucket = new boolean[maxVal + 1];

        for (int elem : arr)
        {
            if (bucket[elem])
            {
                return true; // duplicate found
            }

            bucket[elem] = true;
        }
        return false;
    }
    
    

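    The constraint here is space: you need O(maxVal) of it. Allocating the zeroed array also takes O(maxVal) time, so this only pays off when maxVal is not much larger than n.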
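    If maxVal is not known up front, or the values may be negative, one possible variation is to scan for both the minimum and the maximum and shift every value by the minimum. The sketch below is only an illustration under those assumptions; the class name, method name, and the MAX_BUCKETS cap are made up for this example and are not part of the answer above.

```java
final class RangeShiftedDuplicateCheck {
    // Hypothetical cap on the bucket array size, chosen only for illustration.
    private static final long MAX_BUCKETS = 1L << 26;

    /** Returns true if any value occurs more than once in arr. */
    static boolean hasDuplicate(int[] arr) {
        if (arr.length < 2) {
            return false;
        }
        // First O(n) pass: find the value range.
        int min = arr[0];
        int max = arr[0];
        for (int elem : arr) {
            if (elem < min) min = elem;
            if (elem > max) max = elem;
        }
        // Use long arithmetic so that max - min cannot overflow.
        long range = (long) max - (long) min + 1;
        if (range > MAX_BUCKETS) {
            throw new IllegalArgumentException("value range too large for a bucket array: " + range);
        }
        boolean[] seen = new boolean[(int) range];
        // Second O(n) pass: mark each value, shifted so that min maps to index 0.
        for (int elem : arr) {
            int index = elem - min;
            if (seen[index]) {
                return true;
            }
            seen[index] = true;
        }
        return false;
    }
}
```

    The space cost becomes O(max - min) rather than O(maxVal), which helps when the values are clustered far from zero.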

  • 2020-12-22 15:08

    You can do this in linear time (O(n)) for any input if you use a hash table, which has expected constant-time lookups.
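    As a rough sketch of the hash-table approach, using java.util.HashSet (the class and method names here are invented for illustration):

```java
import java.util.HashSet;
import java.util.Set;

final class HashSetDuplicateCheck {
    /** Returns true if any value occurs more than once in arr. Expected O(n) time. */
    static boolean hasDuplicate(int[] arr) {
        Set<Integer> seen = new HashSet<>();
        for (int elem : arr) {
            // Set.add returns false when the element is already present.
            if (!seen.add(elem)) {
                return true;
            }
        }
        return false;
    }
}
```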

    However, this is not what you are being asked about.

    By limiting the possible values in the array, you can achieve linear performance.

    E.g., if your integers are in the range 1..L, you can allocate a bit array of length L, initialize it to 0, and iterate over your input array, checking and then setting the corresponding bit for each element. If the bit is already set, that element is a duplicate.
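    A minimal sketch of the bit-array idea, assuming the values really are in 1..L and using java.util.BitSet as the bit array (the class name, method name, and the limit parameter standing in for L are chosen for this example):

```java
import java.util.BitSet;

final class BitArrayDuplicateCheck {
    /** Assumes every value in arr lies in 1..limit, where limit plays the role of L. */
    static boolean hasDuplicate(int[] arr, int limit) {
        BitSet seen = new BitSet(limit); // all bits start cleared
        for (int elem : arr) {
            if (elem < 1 || elem > limit) {
                throw new IllegalArgumentException("value out of range 1.." + limit + ": " + elem);
            }
            int bit = elem - 1; // map 1..limit onto bit indices 0..limit-1
            if (seen.get(bit)) {
                return true;
            }
            seen.set(bit);
        }
        return false;
    }
}
```

    Compared with a boolean array, the bit set needs roughly one bit per possible value instead of one byte.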

  • 2020-12-22 15:17

    Nested loops get you O(N*M) or O(N*log(M)). To reach O(N), you cannot use nested loops.

    I would use a histogram instead:

    DWORD in[N]={ ... }; // input data; values are in the range [0, M)
    DWORD his[M]={ ... }; // histogram of in[]
    int i,j;
    
    // compute histogram O(N+M)
    for (i=0;i<M;i++) his[i]=0;     // this can also be done with memset
    for (i=0;i<N;i++) his[in[i]]++; // if the value range does not start at 0, shift it first
    
    // remove duplicates O(N), keeping the last occurrence of each value
    for (i=0,j=0;i<N;i++)
     {
     his[in[i]]--;      // count down the remaining occurrences of this value
     in[j]=in[i];       // copy item
     if (his[in[i]]<=0) j++; // no occurrences left, so keep this copy
     }
    // now j holds the new in[] array size
    

    [Notes]

    • If the value range is too large and only sparsely populated, you need to convert his[] to a dynamic list with two values per item: the value from in[] and its occurrence count.
    • Looking up a value in that list then needs either a nested loop, giving O(N*M), or a binary search, giving O(N*log(M)).
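
    For the sparse case, here is one possible Java sketch. Note that it swaps in java.util.TreeMap (a balanced search tree) for the sorted list with binary search described in the notes; lookups and updates still cost O(log M) for M distinct values, so the whole pass is O(N*log(M)). The class and method names are invented for illustration, and the method returns a new array instead of compacting in place.

```java
import java.util.Map;
import java.util.TreeMap;

final class SparseHistogram {
    /** Returns a copy of in with duplicates removed, keeping the last occurrence of each value. */
    static int[] removeDuplicates(int[] in) {
        // value -> occurrence count; each lookup or update is O(log M).
        Map<Integer, Integer> his = new TreeMap<>();
        for (int value : in) {
            his.merge(value, 1, Integer::sum);
        }
        int[] out = new int[his.size()];
        int j = 0;
        for (int value : in) {
            // Count down; the count reaches zero at the last occurrence of value.
            int remaining = his.merge(value, -1, Integer::sum);
            if (remaining == 0) {
                out[j++] = value;
            }
        }
        return out;
    }
}
```

    The input contains a duplicate exactly when the returned array is shorter than the input.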