Question
In C++11, calling arma_rng::set_seed_random() creates a bottleneck. Here is a way to reproduce it.
Consider this simple code:
#include <armadillo> // Load Armadillo library.
using namespace arma;
int main()
{
    bool jj = true;
    while ( jj == true ){
        arma_rng::set_seed_random(); // Set the seed to generate random numbers.
        double rnd_number = randu<double>(); // Generate a random number.
    }
}
I compiled it with
g++ -std=c++11 -Wall -g bayesian_estimation.cpp -o bayesian_estimation -O2 -larmadillo
When I run the executable in a terminal, I see that one core handles it with a CPU% close to 100%. If I run more instances of it, the CPU% of each process drops, but no new (and idle!) cores are used. I illustrate this kind of behavior in detail in this question.
Why is this happening?
Answer 1:
I would assume that Armadillo takes the seed for set_seed_random() from the pool of true random numbers that is maintained by the OS (e.g. /dev/random on most *NIX systems). Since this needs a physical source of entropy (usually the timing of keystrokes, network events, and other interrupt sources), this pool is finite and can be exhausted faster than new random numbers can be generated.
And in your case, I would assume that one executable running at full speed is depleting the pool at roughly the same rate that new entropy is added. As soon as you add a second, third, ..., they stall while waiting for new random numbers to enter the pool.
Answer 2:
Take a look at the code for a pseudo-random number generator and you'll find that instantiating it and/or giving it a new seed can be a rather costly process. You should generally instantiate and seed only one per thread and use it for the rest of the thread's life.
#include <armadillo> // Load Armadillo library.
using namespace arma;
int main()
{
    bool jj = true;
    arma_rng::set_seed_random(); // Set the seed once, before the loop.
    while ( jj == true ){
        double rnd_number = randu<double>(); // Generate a random number.
    }
}
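The same seed-once-per-thread idea can be sketched with only the C++11 standard library instead of Armadillo. This is a minimal sketch, not Armadillo's API: the helper names thread_rng and mean_of_uniform_draws are hypothetical, and std::random_device is assumed here as the seed source.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Hypothetical helper: returns a generator that is seeded once per thread.
inline std::mt19937_64& thread_rng()
{
    // thread_local: the engine is constructed (and seeded) the first time
    // each thread calls this function, then reused for the thread's life.
    thread_local std::mt19937_64 engine{std::random_device{}()};
    return engine;
}

// Hypothetical helper: draw `count` uniform numbers in [0, 1) from the
// calling thread's generator and return their mean. No reseeding happens
// inside the loop.
inline double mean_of_uniform_draws(std::size_t count)
{
    assert(count > 0);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    double sum = 0.0;
    for (std::size_t i = 0; i < count; ++i)
        sum += dist(thread_rng());
    return sum / static_cast<double>(count);
}
```

Any worker thread that calls thread_rng() gets its own engine, so the cost of seeding is paid once per thread rather than once per random number.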
It looks like arma_rng::set_seed_random() uses a variety of fallbacks unless ARMA_USE_CXX11 is defined. My guess is that it gets lucky when trying /dev/urandom. See man urandom for more information about that.
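To illustrate what a "try /dev/urandom, then fall back" seeding strategy could look like, here is a hedged sketch. It is not Armadillo's actual implementation: the function names seed_from_urandom and make_seed are hypothetical, and the fallback to std::random_device is an assumption chosen for this example.

```cpp
#include <cstdint>
#include <fstream>
#include <random>

// Hypothetical helper: try to read 8 bytes from /dev/urandom into `seed`.
// Returns false if the device cannot be opened or read (e.g. on non-*NIX
// systems).
inline bool seed_from_urandom(std::uint64_t& seed)
{
    std::ifstream dev("/dev/urandom", std::ios::binary);
    if (!dev)
        return false;
    dev.read(reinterpret_cast<char*>(&seed), sizeof seed);
    return static_cast<bool>(dev); // read() sets failbit on a short read
}

// Hypothetical helper: obtain a seed, falling back to std::random_device
// (an assumed fallback, not taken from Armadillo) when /dev/urandom fails.
inline std::uint64_t make_seed()
{
    std::uint64_t seed = 0;
    if (seed_from_urandom(seed))
        return seed;
    std::random_device rd;
    return (static_cast<std::uint64_t>(rd()) << 32) | rd();
}
```

Each call opens and reads a device file, which is why such a function is best called once at startup, e.g. std::mt19937_64 engine(make_seed());, rather than inside a hot loop.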
Source: https://stackoverflow.com/questions/53209631/c11-parallelization-bottleneck-in-armadillos-set-seed-random