AVX/SSE version of xorshift128+

醉酒成梦 2020-12-06 06:24

I am trying to make the fastest possible high-quality RNG. After reading http://xorshift.di.unimi.it/ , I think xorshift128+ looks like a good option. The C code is

2 answers
  •  佛祖请我去吃肉
    2020-12-06 06:49

    XorShift is indeed a good choice. It is good, fast, and needs so little state that I'm surprised it has not been adopted more widely. It should be the standard generator on all platforms. I implemented it myself 8 years ago, and even then it could generate 800 MB/s of random bytes.

    You cannot use vector instructions to speed up generating a single random number. There is too little instruction-level parallelism in those few instructions.

    But you can easily speed up generating N numbers where N is the vector size of your target instruction set. Just run N generators in parallel. Keep state for N generators and generate N numbers at the same time.
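
    A minimal sketch of the "N generators in parallel" idea, written as portable C rather than intrinsics. Several things here are assumptions and not taken from the question: N is fixed at 4 (one 256-bit vector of 64-bit lanes), the shift constants 23, 17, 26 are one published xorshift128+ parameter set, the names `xs128p_lanes`, `xs128p_seed` and `xs128p_next` are made up for illustration, and splitmix64 is used only to give each lane a different seed.

```c
#include <stdint.h>
#include <stddef.h>

#define LANES 4  /* assumed: four 64-bit lanes, i.e. one 256-bit vector */

/* State for LANES independent xorshift128+ generators, stored
   structure-of-arrays so each array maps onto one vector register. */
typedef struct {
    uint64_t s0[LANES];
    uint64_t s1[LANES];
} xs128p_lanes;

/* splitmix64: used here only to derive a distinct seed for every lane. */
static uint64_t splitmix64(uint64_t *x) {
    uint64_t z = (*x += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Fill all lane states from a single 64-bit seed. */
static void xs128p_seed(xs128p_lanes *g, uint64_t seed) {
    for (size_t i = 0; i < LANES; i++) {
        g->s0[i] = splitmix64(&seed);
        g->s1[i] = splitmix64(&seed);
    }
}

/* Advance every lane by one step and write LANES outputs.
   The lanes never interact, so each statement in the loop body
   corresponds to a single element-wise vector operation. */
static void xs128p_next(xs128p_lanes *g, uint64_t out[LANES]) {
    for (size_t i = 0; i < LANES; i++) {
        uint64_t a = g->s0[i];
        const uint64_t b = g->s1[i];
        g->s0[i] = b;
        a ^= a << 23;
        a ^= b ^ (a >> 17) ^ (b >> 26);
        g->s1[i] = a;
        out[i] = a + b;
    }
}
```

    Because the loop has a fixed trip count and no cross-lane dependency, an optimizing compiler may vectorize it on its own; translating it by hand to SSE2 or AVX2 intrinsics is a mechanical step from here.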

    If client code demands numbers one at a time you could keep a buffer of N (or more) numbers. If the buffer is empty you fill it using vector instructions. If the buffer is not empty you just return the next number.
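
    The buffering scheme could look roughly like this. It is a sketch under assumptions, not a tuned implementation: `buffered_rng` and its functions are hypothetical names, the buffer holds 16 batches of 4 lanes, the shift constants 23, 17, 26 are assumed, and the caller is expected to supply seed words that are not both zero within any lane.

```c
#include <stdint.h>
#include <stddef.h>

#define LANES   4             /* assumed vector width in 64-bit lanes */
#define BUF_LEN (LANES * 16)  /* assumed buffer size, a multiple of LANES */

typedef struct {
    uint64_t s0[LANES], s1[LANES]; /* one xorshift128+ state per lane */
    uint64_t buf[BUF_LEN];         /* numbers generated but not yet handed out */
    size_t   pos;                  /* next unread index; BUF_LEN means empty */
} buffered_rng;

/* Caller supplies 2*LANES seed words, two per lane. */
static void buffered_rng_init(buffered_rng *r, const uint64_t seed[2 * LANES]) {
    for (size_t i = 0; i < LANES; i++) {
        r->s0[i] = seed[2 * i];
        r->s1[i] = seed[2 * i + 1];
    }
    r->pos = BUF_LEN;  /* start empty so the first request triggers a refill */
}

/* Refill the whole buffer, LANES numbers per inner loop.
   The inner loop is the part meant to run as vector instructions. */
static void buffered_rng_refill(buffered_rng *r) {
    for (size_t k = 0; k < BUF_LEN; k += LANES) {
        for (size_t i = 0; i < LANES; i++) {
            uint64_t a = r->s0[i];
            const uint64_t b = r->s1[i];
            r->s0[i] = b;
            a ^= a << 23;
            a ^= b ^ (a >> 17) ^ (b >> 26);
            r->s1[i] = a;
            r->buf[k + i] = a + b;
        }
    }
    r->pos = 0;
}

/* One number at a time: the common path is a compare, a load and an increment. */
static inline uint64_t buffered_rng_next(buffered_rng *r) {
    if (r->pos == BUF_LEN)
        buffered_rng_refill(r);
    return r->buf[r->pos++];
}
```

    A larger buffer amortizes the refill branch over more calls but costs more cache; the right size is something to measure rather than assume.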
