I recently attended an interview where I was asked: "write a program to find 100 largest numbers out of an array of 1 billion numbers."
I was only able to give a br
My immediate reaction to this would be to use a heap, but there is a way to use QuickSelect without keeping all of the input values on hand at any one time.
Create an array of size 200 and fill it with the first 200 input values. Run QuickSelect and discard the low 100, leaving you with 100 free places. Read in the next 100 input values and run QuickSelect again. Continue until you have run through the entire input in batches of 100.
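The batching steps above can be sketched roughly as follows. This is a minimal illustration, not a tuned implementation: the names `partition_top` and `top_k` are made up for this example, the QuickSelect is a plain Hoare-style variant with a random pivot, and the input is assumed to be any iterable of comparable values.

```python
import random


def partition_top(buf, k):
    """Rearrange buf in place so its k largest values sit in buf[:k] (unordered)."""
    if k >= len(buf):
        return  # nothing to discard yet
    lo, hi = 0, len(buf) - 1
    while lo < hi:
        pivot = buf[random.randint(lo, hi)]
        i, j = lo, hi
        # Hoare partition, descending: larger values move to the left.
        while i <= j:
            while buf[i] > pivot:
                i += 1
            while buf[j] < pivot:
                j -= 1
            if i <= j:
                buf[i], buf[j] = buf[j], buf[i]
                i += 1
                j -= 1
        # Now buf[lo..j] >= pivot and buf[i..hi] <= pivot, with j < i.
        if k - 1 <= j:
            hi = j
        elif k - 1 >= i:
            lo = i
        else:
            break  # position k-1 already holds its final value


def top_k(stream, k=100):
    """Return the k largest values of stream, holding at most 2*k values at once."""
    buf = []
    for x in stream:
        buf.append(x)
        if len(buf) == 2 * k:
            partition_top(buf, k)
            del buf[k:]  # discard the low half, freeing k places
    partition_top(buf, k)  # handle the final, possibly partial, batch
    del buf[k:]
    return buf
```

Something like `top_k(numbers, 100)` would then return the 100 largest values in no particular order; sort the result afterwards if the order matters.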
At the end you have the top 100 values. For N values you have run QuickSelect roughly N/100 times. Each QuickSelect run costs about 200 times some constant, so the total cost is 2N times some constant. This looks linear in the size of the input to me, regardless of the parameter size that I am hardwiring to be 100 in this explanation.
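Written out with a general parameter k in place of 100, the estimate looks roughly like this. The symbols k and c are introduced here only for illustration, and c is assumed to be the expected per-element cost of QuickSelect; a plain QuickSelect is only linear on average, so this is an expected-time bound rather than a worst-case one.

```latex
\[
\underbrace{\frac{N}{k}}_{\text{QuickSelect runs}}
\times
\underbrace{c \cdot 2k}_{\text{cost per run}}
= 2cN = O(N)
\]
```

The k cancels, which is why the total does not depend on the hardwired 100, while the working memory stays at 2k values.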