Limit input data to achieve a better Big O complexity

淺唱寂寞╮ submitted on 2019-11-29 18:16:22

You can do this in linear time (O(n)) for any input if you use a hash table (which has constant expected look-up time).
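A minimal sketch of the hash-table approach, assuming the task is to detect whether an int array contains a duplicate (which is what the other answers here address). The class and method names are made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

class HashDuplicateCheck {
    // Returns true if any value occurs more than once in arr.
    // Expected O(n) time, O(n) extra space in the worst case.
    static boolean hasDuplicate(int[] arr) {
        Set<Integer> seen = new HashSet<>();
        for (int value : arr) {
            // Set.add returns false when the value was already present.
            if (!seen.add(value)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicate(new int[] {3, 1, 4, 1, 5})); // true
        System.out.println(hasDuplicate(new int[] {3, 1, 4}));       // false
    }
}
```

Note that the O(n) bound holds on average; it relies on the hash function spreading the keys well.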

However, this is not what you are being asked about.

By limiting the possible values in the array, you can achieve linear performance.

E.g., if your integers have range 1..L, you can allocate a bit array of length L, initialize it to 0, and iterate over your input array, checking and flipping the appropriate bit for each input.
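The bit-array idea could look roughly like this in Java. This is only a sketch: it uses java.util.BitSet as the bit array, the parameter limit stands in for L, and the range check is an added assumption about how out-of-range input should be handled.

```java
import java.util.BitSet;

class BitArrayDuplicateCheck {
    // Assumes every value lies in 1..limit (limit plays the role of L).
    static boolean hasDuplicate(int[] arr, int limit) {
        BitSet seen = new BitSet(limit); // all bits start cleared
        for (int value : arr) {
            if (value < 1 || value > limit) {
                throw new IllegalArgumentException("value out of range: " + value);
            }
            int bit = value - 1; // map 1..limit onto bit indices 0..limit-1
            if (seen.get(bit)) {
                return true; // bit already set, so this value was seen before
            }
            seen.set(bit);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicate(new int[] {1, 5, 3, 5}, 5));    // true
        System.out.println(hasDuplicate(new int[] {1, 2, 3, 4, 5}, 5)); // false
    }
}
```

The memory cost is about L bits regardless of how many elements the input has.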

A variant of bucket sort will do. This gives you a complexity of O(n), where n is the number of input elements.

There is one restriction: you need to know the maximum value your integer array can contain. Let's call it m.

The idea is to create a boolean array of size m + 1 (all initialized to false), so that every value from 0 to m is a valid index. Then iterate over your array. For each element e, look at bucket[e]: if it is already true, you've encountered a duplicate; otherwise set it to true.

In Java:



// Alternatively, you can iterate over the array to find maxVal, which is again O(n).
// Assumes every element is in the range 0..maxVal.
public boolean findDup(int[] arr, int maxVal)
{
    // Java initializes all the values to false by default.
    // Size is maxVal + 1 so that bucket[maxVal] is a valid index.
    boolean[] bucket = new boolean[maxVal + 1];

    for (int elem : arr)
    {
        if (bucket[elem])
        {
            return true; // a duplicate found
        }

        bucket[elem] = true;
    }
    return false;
}

The constraint here is space: you need O(maxVal) extra space, and zero-initializing that array also costs O(maxVal) time.
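If the maximum is not known in advance, one possible sketch is to scan for the minimum and maximum first and index the buckets by value minus minimum. The class name, the MAX_SPAN cap, and the decision to throw when the range is too wide are all assumptions made for illustration.

```java
class RangeScanDuplicateCheck {
    // Hypothetical cap on the bucket array size, chosen only for this sketch.
    static final long MAX_SPAN = 1L << 24;

    static boolean hasDuplicate(int[] arr) {
        if (arr.length == 0) {
            return false;
        }
        // First pass, O(n): find the value range.
        int min = arr[0];
        int max = arr[0];
        for (int v : arr) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        // Use long arithmetic so max - min cannot overflow an int.
        long span = (long) max - min + 1;
        if (span > MAX_SPAN) {
            throw new IllegalArgumentException("value range too wide: " + span);
        }
        // Second pass, O(n): shift each value so the smallest maps to index 0.
        boolean[] bucket = new boolean[(int) span];
        for (int v : arr) {
            int idx = v - min;
            if (bucket[idx]) {
                return true;
            }
            bucket[idx] = true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicate(new int[] {-3, 7, 0, 7})); // true
        System.out.println(hasDuplicate(new int[] {-3, 7, 0}));    // false
    }
}
```

Shifting by the minimum also makes negative values usable as indices, at the cost of one extra pass over the input.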

Nested loops give you O(N*M) or O(N*log(M)); to get O(N) you cannot use nested loops.

I would use a histogram instead:

DWORD in[N]={ ... };  // input data, values are in the range [0, M)
DWORD his[M]={ ... }; // histogram of in[]
int i,j;

// compute histogram, O(N + M)
for (i=0;i<M;i++) his[i]=0;     // this can also be done with memset
for (i=0;i<N;i++) his[in[i]]++; // if the value range does not start at 0, shift it first

// remove duplicates in place, O(N); the last occurrence of each value is kept
for (i=0,j=0;i<N;i++)
 {
 his[in[i]]--;           // one fewer copy of this value remains ahead
 in[j]=in[i];            // copy the item to the output position
 if (his[in[i]]<=0) j++; // no later copy remains, so keep this one
 }
// now j holds the new in[] array size

[Notes]

  • If the value range is too big and sparsely populated, convert his[] to a dynamic list with two values per item: the value from in[] and its occurrence count.
  • Looking up a value in that list then needs a nested loop -> O(N*M), or a binary search -> O(N*log(M)).
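For the sparse case, here is a rough Java sketch. Note that it swaps in java.util.TreeMap (a balanced tree keyed by value) for the sorted list with binary search described in the notes. Each look-up is still logarithmic in the number of distinct values, so the overall cost stays around O(N*log(M)). The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.TreeMap;

class SparseHistogramDedup {
    // Removes duplicates in place, keeping the last occurrence of each value,
    // and returns the new logical length of the array.
    static int removeDuplicates(int[] in) {
        // Sparse histogram: value -> occurrence count.
        TreeMap<Integer, Integer> his = new TreeMap<>();
        for (int value : in) {
            his.merge(value, 1, Integer::sum);
        }
        int j = 0;
        for (int i = 0; i < in.length; i++) {
            int value = in[i];
            // Count down; zero means no later copy of this value remains.
            int remaining = his.merge(value, -1, Integer::sum);
            if (remaining == 0) {
                in[j++] = value;
            }
        }
        return j;
    }

    public static void main(String[] args) {
        int[] data = {1000000000, 7, 1000000000, -5, 7};
        int n = removeDuplicates(data);
        System.out.println(Arrays.toString(Arrays.copyOf(data, n))); // [1000000000, -5, 7]
    }
}
```

A HashMap would bring the expected cost back to O(N), which is the hash-table approach from the first answer applied to the histogram.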