Pick a unique random subset from a set of unique values

慢半拍i 2020-12-10 03:42

C++. Visual Studio 2010.

I have a std::vector V of N unique elements (heavy structs). How can I efficiently pick M random, unique elemen

3 Answers
  •  情歌与酒
    2020-12-10 04:27

    Since you want it to be efficient, I think you can get amortised O(M) per call, assuming you perform the operation many times. However, this approach is not reentrant, because it keeps shared state between calls.

    First, create a function-local static vector of std::vector<...>::size_type values (unsigned will do).

    When you enter your function, grow the vector to at least N elements by appending the values from the old size up to N-1:

    static std::vector<unsigned> indices;
    if (indices.size() < N) {
      indices.reserve(N);
      for (unsigned i = indices.size(); i < N; i++) {
        indices.push_back(i);
      }
    }
    

    Then, randomly pick M unique numbers from that vector:

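    std::vector<unsigned> result;
    result.reserve(M);
    for (unsigned i = 0; i < M; i++) {
      unsigned const r = getRandomNumber(0,N-i); // random number < N-i
      result.push_back(indices[r]);
      indices[r] = indices[N-i-1]; // move the last unpicked value into the freed slot
      indices[N-i-1] = r;          // remember which slot was overwritten, for the repair step
    }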
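
    getRandomNumber is not defined above. A minimal sketch of it, assuming the half-open contract implied by the comment (the result is >= the first argument and < the second), could be built on <random>. The engine, the seeding, and the shared static state are assumptions for illustration only:

```cpp
#include <cassert>
#include <random>

// Hypothetical helper matching the call getRandomNumber(0, N-i):
// returns a uniformly distributed integer in the half-open range [lo, hi).
unsigned getRandomNumber(unsigned lo, unsigned hi)
{
    assert(lo < hi);  // an empty range has nothing to draw from

    // One engine shared by all calls, seeded once.
    // Like the static index table, this is not thread-safe.
    static std::random_device seed;
    static std::mt19937 engine(seed());

    // uniform_int_distribution uses a closed range, hence hi - 1.
    std::uniform_int_distribution<unsigned> dist(lo, hi - 1);
    return dist(engine);
}
```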
    

    Your M unique indices into V are now in the result vector.
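
    Since the elements of V are heavy structs, one way to use those indices without copying anything is to collect pointers to the chosen elements. This is only a sketch; selectByIndex is a hypothetical name, not part of the approach described here:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: maps the chosen indices to pointers into V,
// so that no heavy struct is copied.
template <typename T>
std::vector<const T*> selectByIndex(const std::vector<T>& V,
                                    const std::vector<unsigned>& result)
{
    std::vector<const T*> picked;
    picked.reserve(result.size());
    for (std::size_t i = 0; i < result.size(); ++i) {
        assert(result[i] < V.size());     // every index must refer to an element of V
        picked.push_back(&V[result[i]]);  // store the address, not a copy
    }
    return picked;
}
```

    The pointers stay valid only as long as V is neither destroyed nor reallocated.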

    However, you still have to undo your changes to indices before the next run, so that indices[i] == i holds again for every i:

    for (unsigned i = N-M; i < N; i++) {
      // restore previously changed values
      indices[indices[i]] = indices[i];
      indices[i] = i;
    }
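
    Putting the three steps together, a self-contained sketch could look like the following. The function name pickUniqueIndices, the use of std::size_t instead of unsigned, and the caller-supplied std::mt19937 are assumptions made for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical wrapper: returns M distinct indices in [0, N).
// Not reentrant and not thread-safe, because it mutates a static table.
std::vector<std::size_t> pickUniqueIndices(std::size_t N, std::size_t M,
                                           std::mt19937& rng)
{
    assert(M <= N);

    // Step 1: grow the persistent table so that indices[i] == i for all i < N.
    static std::vector<std::size_t> indices;
    if (indices.size() < N) {
        indices.reserve(N);
        for (std::size_t i = indices.size(); i < N; ++i) {
            indices.push_back(i);
        }
    }

    // Step 2: draw M values from the shrinking front part of the table.
    std::vector<std::size_t> result;
    result.reserve(M);
    for (std::size_t i = 0; i < M; ++i) {
        std::uniform_int_distribution<std::size_t> dist(0, N - i - 1);
        std::size_t const r = dist(rng);  // random position < N-i
        result.push_back(indices[r]);
        indices[r] = indices[N - i - 1];  // refill the slot with the last unpicked value
        indices[N - i - 1] = r;           // remember the overwritten slot
    }

    // Step 3: restore the identity mapping for the next call.
    for (std::size_t i = N - M; i < N; ++i) {
        indices[indices[i]] = indices[i];
        indices[i] = i;
    }
    return result;
}
```

    A call such as pickUniqueIndices(V.size(), M, rng) then yields M distinct positions in V.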
    

    But this approach is only useful if you have to run the algorithm many times and N doesn't grow so big that you cannot live with indices occupying RAM all the time.
