I recently attended an interview where I was asked "write a program to find 100 largest numbers out of an array of 1 billion numbers."
I was only able to give a brute-force approach.
Although in this question we should search for the top 100 numbers, I will generalize and write x. Still, I will treat x as a constant value.
Algorithm: biggest x elements from n.
I will call the return value LIST. It is a set of x elements (in my opinion it should be a structure with O(log x) insertion and O(1) access to the smallest element, such as a min-heap; a plain linked list would make each insertion O(x)).
Take the first x elements and sort them into LIST, which costs x log(x). Then, for each of the remaining n-x elements, check whether it is greater than the smallest element in LIST; if it is, remove the smallest and insert the new one.
So, what is the worst case scenario?
x log(x) + (n-x)(log(x) + 1) = n log(x) + n - x
So the worst case is O(n) time, since x is a constant. The +1 is the check whether the number is greater than the smallest one in LIST. The expected time in the average case depends on the mathematical distribution of those n elements.
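A minimal sketch of this algorithm in Python, assuming LIST is a min-heap (the answer leaves the exact structure open) so that the smallest element sits at `heap[0]` and a replacement costs O(log x):

```python
import heapq

def biggest_x(numbers, x):
    """Return the x largest elements of numbers (assumed len >= x)."""
    it = iter(numbers)
    # Build LIST from the first x elements: O(x log x).
    heap = []
    for _ in range(x):
        heapq.heappush(heap, next(it))
    # For each remaining element: one comparison against the
    # current smallest, plus an O(log x) replace only when it wins.
    for value in it:
        if value > heap[0]:
            heapq.heapreplace(heap, value)
    return sorted(heap, reverse=True)

print(biggest_x(range(1000), 5))  # → [999, 998, 997, 996, 995]
```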
Possible improvements
This algorithm can be slightly improved for the worst-case scenario, but IMHO (I cannot prove this claim) that will degrade average behavior. The asymptotic behavior stays the same.
The improvement is that we no longer check whether an element is greater than the smallest. For each element we simply try to insert it, and if it is smaller than the smallest we discard it. Although that sounds preposterous, if we look only at the worst-case scenario we get
x log(x) + (n-x) log(x) = n log(x)
operations.
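The no-check variant can be sketched with `heapq.heappushpop` (again a min-heap is my assumption): it pushes the new value and pops the smallest in a single O(log x) step, so a value smaller than the current minimum is returned straight back, i.e. discarded, with no separate comparison branch.

```python
import heapq

def biggest_x_no_check(numbers, x):
    """Variant without the 'greater than smallest' test: every one of
    the n - x later elements pays a flat O(log x) push-and-pop."""
    it = iter(numbers)
    heap = [next(it) for _ in range(x)]
    heapq.heapify(heap)  # O(x) here; the text's accounting charges x log x
    for value in it:
        # Push value, pop the smallest; a losing value comes right back out.
        heapq.heappushpop(heap, value)
    return sorted(heap, reverse=True)
```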
For this use case I don't see any further improvements. Yet you must ask yourself: what if I have to do this more than log(n) times, and for different values of x? Then we would obviously sort the array once in O(n log(n)) and take our x elements whenever we need them.
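The sort-once idea from the last paragraph, as a small illustrative sketch (the function name and the list-of-queries interface are my own choices, not from the answer):

```python
def top_x_many_queries(numbers, xs):
    """Sort once in O(n log n); answer each query for the top x
    elements by slicing the descending-sorted array in O(x)."""
    ordered = sorted(numbers, reverse=True)
    return [ordered[:x] for x in xs]

print(top_x_many_queries([3, 1, 4, 1, 5], [2, 3]))  # → [[5, 4], [5, 4, 3]]
```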