This question on getting random values from a finite set got me thinking...
It's fairly common for people to want to retrieve X unique values from a set of Y values.
There's a beautiful O(n)-time algorithm for this. It goes as follows. Say you have n items, from which you want to pick m items. I assume the function rand() yields a uniformly distributed random real number in [0, 1). Here's the algorithm:
    items_left = n
    items_left_to_pick = m
    for j = 1,...,n
        if rand() < (items_left_to_pick / items_left)
            Pick item j
            items_left_to_pick = items_left_to_pick - 1
        end
        items_left = items_left - 1
    end
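The pseudocode above could be written in Python roughly as follows. This is a minimal sketch, not a reference implementation: the function name `sample_indices` and its `rng` parameter are my own choices for illustration, and indices run from 0 rather than 1. It relies on `random.random()` returning a float in [0, 1) and on `/` being true division in Python 3. The early exit once nothing is left to pick is an addition that does not change which subsets can be produced.

```python
import random


def sample_indices(n, m, rng=random):
    """Pick m distinct indices from range(n) in a single pass.

    Hypothetical helper for illustration. Indices come back in
    increasing order because items are visited in order.
    """
    if not 0 <= m <= n:
        raise ValueError("need 0 <= m <= n")

    picked = []
    items_left_to_pick = m
    for j in range(n):
        if items_left_to_pick == 0:
            break  # nothing more to pick, so the remaining items are all skipped
        items_left = n - j
        # rng.random() is in [0, 1), so the strict '<' picks item j with
        # probability items_left_to_pick / items_left, and always picks it
        # when every remaining item is needed (ratio == 1.0).
        if rng.random() < items_left_to_pick / items_left:
            picked.append(j)
            items_left_to_pick -= 1
    return picked
```

To sample from an actual list `items`, something like `[items[i] for i in sample_indices(len(items), m)]` should work.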
It can be proved that this algorithm does indeed pick each subset of m items with equal probability, though the proof is non-obvious. Unfortunately, I don't have a reference handy at the moment.
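A sketch of one possible argument (a reconstruction, not a cited proof): fix any particular subset S of m items and multiply the probabilities of the n decisions that produce exactly S. The denominators are the successive values of items_left, namely n, n-1, ..., 1. Each time an item is picked, the numerator is items_left_to_pick, which runs through m, m-1, ..., 1. Each time an item is skipped, the numerator is items_left - items_left_to_pick, the number of skips still to come, which runs through n-m, n-m-1, ..., 1. So, whatever S is:

```latex
P(S) \;=\; \frac{m!\,(n-m)!}{n!} \;=\; \frac{1}{\binom{n}{m}}
```

Since this value does not depend on which subset S was chosen, every m-subset comes out with the same probability.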
Edit: The advantage of this algorithm is that it needs only O(m) memory (assuming the items are simply integers or can be generated on the fly), whereas doing a shuffle takes O(n) memory.