I'm working on porting a MATLAB simulation into C++. To do this, I am trying to replicate MATLAB's randsample() function. I haven't figured out an efficient way to do thi
Bob Floyd created a random sample algorithm that uses sets. The intermediate structure size is proportional to the sample size you want to take.
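It works by making K draws, one per loop iteration, and adding each result to a set. On the iteration with counter value d, it draws a number uniformly from 0 to d. If that number is already in the set, it inserts d itself instead; d cannot have been picked yet, because every earlier draw came from a smaller range. Each iteration therefore adds exactly one new element, so the loop runs K times and the running time is linear in the sample size (assuming constant-time set operations on average), with no large intermediate structure. For example, with a range of 10 and K = 3, the counter takes the values 7, 8, 9: drawing 4 from 0..7 gives {4}; drawing 4 again from 0..8 collides, so 8 is inserted; drawing 2 from 0..9 gives {2, 4, 8}. Despite the substitution step, every possible K-element subset is equally likely.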
This code is basically lifted from Programming Pearls with some modifications to use more modern C++.
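    #include <random>
    #include <unordered_set>

    // Returns sampleSize distinct integers drawn from [0, rangeUpperBound).
    // Assumes 0 <= sampleSize <= rangeUpperBound.
    std::unordered_set<int> BobFloydAlgo(int sampleSize, int rangeUpperBound)
    {
        std::unordered_set<int> sample;
        // Seed from the OS; a default-constructed engine repeats the same sequence on every run.
        std::default_random_engine generator{std::random_device{}()};

        for (int d = rangeUpperBound - sampleSize; d < rangeUpperBound; d++)
        {
            // Draw t uniformly from the closed range [0, d].
            int t = std::uniform_int_distribution<>(0, d)(generator);
            // If t was already picked, take d instead; d cannot be in the set yet.
            if (sample.find(t) == sample.end())
                sample.insert(t);
            else
                sample.insert(d);
        }
        return sample;
    }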
This code has not been tested.
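Since the goal is to stand in for MATLAB's randsample(), here is a minimal sketch of a wrapper around the same loop that returns K distinct values from 1..N (1-based, as MATLAB indexes) in shuffled order. The name randsampleLike, its signature, and the choice of std::mt19937 are my assumptions for illustration, not anything from MATLAB or Programming Pearls, and it only covers sampling without replacement.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <unordered_set>
#include <vector>

// Hypothetical helper (name and signature are assumptions):
// returns k distinct integers from 1..n in random order.
std::vector<int> randsampleLike(int n, int k, std::mt19937& rng)
{
    assert(k >= 0 && k <= n);  // cannot draw more distinct values than exist

    std::unordered_set<int> chosen;
    chosen.reserve(static_cast<std::size_t>(k));

    // Floyd's loop over the 0-based range [0, n).
    for (int d = n - k; d < n; ++d) {
        int t = std::uniform_int_distribution<int>(0, d)(rng);
        // insert() reports whether t was new; on a collision fall back to d.
        if (!chosen.insert(t).second)
            chosen.insert(d);
    }

    // Copy out, shifting to 1-based values.
    std::vector<int> result;
    result.reserve(chosen.size());
    for (int v : chosen)
        result.push_back(v + 1);

    // Sort first so the final order does not depend on hash iteration order,
    // then shuffle so the order itself is random.
    std::sort(result.begin(), result.end());
    std::shuffle(result.begin(), result.end(), rng);
    return result;
}
```

Typical use would be to construct one engine, for example std::mt19937 rng(std::random_device{}()), and pass it to every call such as randsampleLike(100, 5, rng). Taking the engine by reference keeps one random stream across calls and lets you fix the seed when you want reproducible runs.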