You are given an unsorted array of n integers, and you would like to find if there are any duplicates in the array (i.e. any integer appearing more than once). D
A variance of Bucket Sort will do. This will give you complexity of O(n)
where 'n' is the number of input elements.
But one restriction - max value. You should know the max value your integer array can take. Lets say it as m
.
The idea is to create a bool array of size m
(all initialized to false). Then iterate over your array. As you find an element, set bucket[m]
to true
. If it is already true
then you've encountered a duplicate.
A java code,
// alternatively, you can iterate over the array to find the maxVal which again is O(n).
public boolean findDup(int [] arr, int maxVal)
{
// java by default assigns false to all the values.
boolean bucket[] = new boolean[maxVal];
for (int elem : arr)
{
if (bucket[elem])
{
return true; // a duplicate found
}
bucket[elem] = true;
}
return false;
}
But the constraint here is the space. You need O(maxVal)
space.
You can do linearly (O(n)
) for any input if you use hash tables (which have constant look-up time).
However, this is not what you are being asked about.
By limiting the possible values in the array, you can achieve linear performance.
E.g., if your integers have range 1..L
, you can allocate a bit array of length L
, initialize it to 0, and iterate over your input array, checking and flipping the appropriate bit for each input.
nested loops get you O(N*M) or O(N*log(M)) for O(N) you can not use nested loops !!!
I would do it by use of histogram instead:
DWORD in[N]={ ... }; // input data ... values are from < 0 , M )
DWORD his[M]={ ... }; // histogram of in[]
int i,j;
// compute histogram O(N)
for (i=0;i<M;i++) his[i]=0; // this can be done also by memset ...
for (i=0;i<N;i++) his[in[i]]++; // if the range of values is not from 0 then shift it ...
// remove duplicates O(N)
for (i=0,j=0;i<N;i++)
{
his[in[i]]--; // count down duplicates
in[j]=in[i]; // copy item
if (his[in[i]]<=0) j++; // if not duplicate then do not delete it
}
// now j holds the new in[] array size
[Notes]