MATLAB: different instances start with the same random seed

萝らか妹 submitted on 2021-01-29 08:54:34

Question


I am using MATLAB on a computer cluster to run 100 repetitions of a calculation that is inherently stochastic. Each repetition should run the same code, but with a different random seed. It seems that

rng('shuffle')

recommended by the documentation may not achieve this if all jobs start running at the same time (on different machines), because the seed is an integer that seems to be initialized from the time (it is monotonically increasing, with a precision of what seems to be a hundredth of a second).

The precision seems reasonable, but "collisions" are still very likely when running 100-1000 instances at the same time, which would invalidate the statistical interpretation of the results as independent.

Is there any way to avoid such collisions without manually giving each instance an "instance id" to use as a seed?


Answer 1:


Whatever you choose for the seed, it can only take on a 32-bit value, even if it will initialize a generator with a bigger state, such as Mersenne Twister ('twister', 19937 bits). There are certain issues with 32-bit seeds, as discussed in "C++ Seeding Surprises" by M. O'Neill. Presumably, the time-based seeds are likewise 32 bits long. A short seed means that only a limited number of pseudorandom sequences can be generated.

It appears that rng doesn't support seeds longer than 32 bits. On the other hand, recent versions of MATLAB support random number streams, which are designed, among other things, for cases where you "want separate sources of randomness in a simulation". For your purposes, choose a generator that supports multiple streams, such as mrg32k3a, and create random number streams as follows (see also "Multiple Streams"):

[stream1, stream2]=RandStream.create('mrg32k3a','NumStreams',2)
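
When each repetition runs in its own MATLAB instance, a single process cannot hand out the streams, so each instance has to pick its own. A minimal sketch, assuming the scheduler exposes a per-job index in an environment variable (SLURM_ARRAY_TASK_ID is used here purely as an example; the variable name, numJobs and jobIndex are assumptions, not part of the original answer). Note that this does rely on a job index, which the question hoped to avoid:

```matlab
% Sketch: pick one of numJobs independent mrg32k3a substreams per job.
% ASSUMPTION: the scheduler sets SLURM_ARRAY_TASK_ID to 1..numJobs.
numJobs = 100;
idStr = getenv('SLURM_ARRAY_TASK_ID');
if isempty(idStr)
    jobIndex = 1;                      % fallback for an interactive run
else
    jobIndex = str2double(idStr);
end

% Create only the stream belonging to this job.
s = RandStream.create('mrg32k3a', ...
    'NumStreams', numJobs, 'StreamIndices', jobIndex);

% Make rand, randn and randi draw from this stream.
RandStream.setGlobalStream(s);
x = rand(1, 5);
```

All instances use the same generator and the same default seed; only the stream index differs, so independence does not depend on when the jobs start.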



Answer 2:


I usually try to get some serial numbers from the machine or HDD, e.g.

dos('wmic bios get serialnumber')

or

dos('wmic cpu')

ProcessorId (e.g. "BFEBFBFF000506E3") is another value that could be used and may differ across your cluster. The machines are likely multicore, so you could perhaps use NumberOfCores to split the work and give each core a different seed.
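
One way to turn such a machine identifier into a seed is to hash it, together with the current time, down to the 32-bit range that rng accepts. A rough sketch, assuming the hostname is an acceptable stand-in for a serial number (the wmic commands above are Windows-only); the variable names and the simple polynomial hash are illustrative choices, not part of the original answer:

```matlab
% Sketch: derive a 32-bit seed from the hostname and the current time.
% ASSUMPTION: 'hostname' is available on the cluster nodes.
[~, out] = system('hostname');
host = strtrim(out);

% Combine the machine identifier with a microsecond-resolution timestamp.
key = sprintf('%s|%.6f', host, posixtime(datetime('now')));

% Polynomial hash reduced modulo 2^32; intermediate values stay well
% below 2^53, so double arithmetic is exact here.
seed = 0;
for c = double(key)
    seed = mod(seed * 31 + c, 2^32);
end

rng(seed, 'twister');
```

Two instances launched on the same machine at the same instant could still produce the same key, and distinct keys can hash to the same 32-bit value, so this lowers the collision risk rather than eliminating it.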



Source: https://stackoverflow.com/questions/62890097/matlab-different-instances-start-with-the-same-random-seed
