Algorithm for sampling without replacement?

情歌与酒 2020-12-02 13:56

I am trying to test the likelihood that a particular clustering of data has occurred by chance. A robust way to do this is Monte Carlo simulation, in which the associations

6 Answers
  •  Happy的楠姐
    2020-12-02 14:14

    Here's some code for sampling without replacement, based on Algorithm S in section 3.4.2 of Knuth's Seminumerical Algorithms (The Art of Computer Programming, Volume 2).

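    #include <vector>

    // Supplied by the caller: returns a uniform random number in [0, 1)
    double GetUniform();

    void SampleWithoutReplacement
    (
        int populationSize,    // size of set sampling from
        int sampleSize,        // size of each sample
        std::vector<int> & samples  // output, zero-offset indices to selected items;
                                    // must already hold at least sampleSize elements
    )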
    {
        // Use Knuth's variable names
        int& n = sampleSize;
        int& N = populationSize;
    
        int t = 0; // total input records dealt with
        int m = 0; // number of items selected so far
        double u;
    
        while (m < n)
        {
            u = GetUniform(); // call a uniform(0,1) random number generator
    
            if ( (N - t)*u >= n - m )
            {
                t++;
            }
            else
            {
                samples[m] = t;
                t++; m++;
            }
        }
    }
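
    The comparison inside the loop is what makes the sample uniform. As a brief sketch in the code's own notation (N, n, t, m), and assuming GetUniform() returns u uniformly distributed on [0, 1), the chance that the record at index t is kept is:

    ```latex
    \Pr[\text{record } t \text{ is selected}]
      = \Pr\bigl[(N - t)\,u < n - m\bigr]
      = \frac{n - m}{N - t}
    ```

    For example, with N = 5 and n = 2 the first record is kept with probability 2/5. Once N - t equals n - m the ratio is 1, so every remaining record is taken and the loop always finishes with exactly n indices, in increasing order.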
    

    There is a more efficient but more complex method by Jeffrey Scott Vitter in "An Efficient Algorithm for Sequential Random Sampling," ACM Transactions on Mathematical Software, 13(1), March 1987, 58-67.
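
    For anyone who wants to try the simpler routine above right away, here is a minimal self-contained sketch. It is not the original poster's exact code: the GetUniform() shown here is a hypothetical stand-in built on the standard <random> header, and the function returns the vector instead of filling a pre-sized one.

    ```cpp
    #include <cassert>
    #include <random>
    #include <vector>

    // Hypothetical stand-in for the GetUniform() the answer assumes:
    // returns a double uniformly distributed in [0, 1).
    double GetUniform()
    {
        static std::mt19937 engine{std::random_device{}()};
        static std::uniform_real_distribution<double> dist(0.0, 1.0);
        return dist(engine);
    }

    // Selection sampling (Knuth's Algorithm S): returns sampleSize distinct
    // zero-based indices from [0, populationSize), in increasing order.
    std::vector<int> SampleWithoutReplacement(int populationSize, int sampleSize)
    {
        assert(0 <= sampleSize && sampleSize <= populationSize);

        std::vector<int> samples;
        samples.reserve(sampleSize);

        int t = 0; // records examined so far
        while (static_cast<int>(samples.size()) < sampleSize)
        {
            int stillNeeded = sampleSize - static_cast<int>(samples.size());

            // Keep record t with probability stillNeeded / (populationSize - t)
            if ((populationSize - t) * GetUniform() < stillNeeded)
            {
                samples.push_back(t);
            }
            ++t;
        }
        return samples;
    }
    ```

    Calling SampleWithoutReplacement(10, 4) gives four distinct indices between 0 and 9 in ascending order. If the sample is needed in random order, it has to be shuffled afterwards, since this method always visits the population front to back.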
