Say I have y
distinct values and I want to select x
of them at random. What\'s an efficient algorithm for doing this? I could just call rand(
Here is a simple way to do it which is only inefficient if Y
is much larger than X
.
void randomly_select_subset(
int X, int Y,
const int * inputs, int X, int * outputs
) {
int i, r;
for( i = 0; i < X; ++i ) outputs[i] = inputs[i];
for( i = X; i < Y; ++i ) {
r = rand_inclusive( 0, i+1 );
if( r < i ) outputs[r] = inputs[i];
}
}
Basically, copy the first X
of your distinct values to your output array, and then for each remaining value, randomly decide whether or not to include that value.
The random number is further used to choose an element of our (mutable) output array to replace.