Concurrent random number generation

Submitted 2019-12-12 10:56:53

Question


I'm writing a parallel program using OpenMP in which I generate a matrix of random floating-point numbers and then do a number of calculations on it. I currently want to make the step where I generate the matrix run in parallel, but I have the problem that the rand() function was not meant to run concurrently. I don't want to use locks to provide mutual exclusion on rand(), because this is the only thing being done in the loop and it would probably be more efficient to just run it sequentially. Is there any way to do this step efficiently in parallel?

Here is the current code for this part (without mutual exclusion on rand()):

#pragma omp parallel default(private)
{
    int i= omp_get_thread_num();
    for(int j=0; j<cols; j++)
        matrix[i][j]= rand()%1000 + (float)(rand()%100)/(float)(rand()%1000);
}

Answer 1:


If you're using C++, you should consider using the Boost library random number classes. You can create a unique PRNG instance for each thread. If you require repeatability, you can initialize each instance in your main thread with repeatably generated seed values.

UPDATE: Turns out after I wrote this, C++11 was released and it included a more modern library for generating random numbers. It includes std::uniform_int_distribution and std::uniform_real_distribution, both of which rely upon a generator such as std::mersenne_twister_engine (or a specific configuration of it, std::mt19937). As an example:

#include <random>
#include <iostream>

int main() {
    std::mt19937 gen;  // Uses default seed value to generate repeatable sequence
    std::uniform_int_distribution<int> d20(1,20);
    std::uniform_real_distribution<float> prob(0.0, 1.0);

    std::cout << d20(gen) << std::endl;
    std::cout << prob(gen) << std::endl;

    return 0;
}



Answer 2:


I think you're looking for rand_r(), which explicitly takes the current RNG state as a parameter. Then each thread should have its own copy of seed data (whether you want each thread to start off with the same seed or different ones depends on what you're doing; here you want them to be different, or you'd get the same row again and again).

There's some discussion of rand_r() and thread-safety here: whether rand_r is real thread safe?

So say you wanted each thread to have its seed start off with its thread number (which is probably not what you want, as it would give the same matrix every time you ran with the same number of threads, but just as an example):

#pragma omp parallel default(none) shared(matrix, cols)
{
    int i= omp_get_thread_num();
    unsigned int myseed = i;
    for(int j=0; j<cols; j++)
        matrix[i][j]= rand_r(&myseed)%1000 + (float)(rand_r(&myseed)%100)/(float)(rand_r(&myseed)%1000 + 1);
}

Now each thread is modifying only its own state (rand_r() touches no hidden global state, only what you pass it) and you should be home free.




Answer 3:


If pseudorandom is good enough (cf. Ben's comment), then you could create your own PRNG (e.g., a Mersenne Twister rather than the weak linear congruential method used by most systems) and implement one independent generator per thread. If you do this, you MUST make sure that each generator has a different seed.




Answer 4:


A real problem arises if you want reproducibility, which is often needed in testing. From a given master seed, generate a sequence of thread seeds; then each thread uses its own seed to generate numbers.

The fact that rand() is not thread safe is hardly an issue. There are plenty of algorithms available, and it is trivial to roll one instance (state) per thread; just start from http://en.wikipedia.org/wiki/Random_number_generation#Computational_methods for instance. Locking on every rand() call would be a concurrency disaster.



Source: https://stackoverflow.com/questions/4234480/concurrent-random-number-generation
