Use rand() to generate uniformly distributed floating point numbers on (a,b), [a,b), (a,b], and [a,b]


失恋的感觉 2021-01-06 06:27

I want to collect the "best" way to generate random numbers on all four types of intervals in one place. I'm sick of Googling this. Search results turn up a lot of crap.

5 answers

予麋鹿 (original poster)

2021-01-06 07:06
This question is not ready for answering because the problem has been incompletely specified. In particular, no specification has been stated for how finely the set of values that can be generated should be distributed. For illustration, consider generating values for [0, 1], and consider a floating-point format with representable values:

0, 1/16, 2/16, 3/16, 4/16, 6/16, 8/16, 12/16, 1.

Several distributions over these values might be considered “uniform”:
- Select each with equal probability. This is uniform over the discrete values but does not have a uniform density over the real distances between the values.
- Select each with some probability proportional to the density of representable values in its vicinity.
- Select 0, 4/16, 8/16, 12/16, and 1 with equal probability, to maintain the same granularity over the interval.
I doubt the first of these was intended, and I will dismiss it. The second is similar to a suggestion by Steve Jessop, but it is still incompletely specified. Should 0 be selected with a probability proportional to the interval from it to the midpoint to the next point? (This would give a probability of 1/32.) Or should it be associated with an interval centered on it, from -1/32 to 1/32? (This would give it a probability of 1/17, presuming 1 were also allocated an interval extended 1/32 beyond itself.)
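The two readings of the second option can be checked numerically. The following is a minimal sketch, not code from the question or from this answer: `toy_values`, `TOY_COUNT`, and `cell_probability` are hypothetical names, and the toy format is stored in sixteenths. Each value is assumed to own the stretch of the real line between the midpoints to its neighbours, and `pad` says how far the cells of 0 and 1 extend beyond the interval.

```c
#include <assert.h>
#include <stddef.h>

/* The toy format from the text, in sixteenths: 0, 1/16, ..., 12/16, 1. */
static const double toy_values[] = {0, 1, 2, 3, 4, 6, 8, 12, 16};
#define TOY_COUNT (sizeof toy_values / sizeof toy_values[0])

/*
 * Probability of toy_values[i] when every value owns the cell that runs
 * from the midpoint with its left neighbour to the midpoint with its
 * right neighbour.
 *
 * pad (in sixteenths) is an assumption about the two end points:
 *   pad = 0.0  clamps the cells of 0 and 1 at the interval ends;
 *   pad = 0.5  extends both end cells by 1/32 beyond the interval.
 */
static double cell_probability(size_t i, double pad)
{
    assert(i < TOY_COUNT && pad >= 0.0);

    double first = toy_values[0];
    double last = toy_values[TOY_COUNT - 1];

    double lo = (i == 0) ? first - pad
                         : (toy_values[i - 1] + toy_values[i]) / 2.0;
    double hi = (i == TOY_COUNT - 1) ? last + pad
                                     : (toy_values[i] + toy_values[i + 1]) / 2.0;

    /* Total length covered by all cells, including any padding. */
    double total = (last + pad) - (first - pad);
    return (hi - lo) / total;
}
```

With `pad = 0.0` the value 0 gets probability 1/32; with `pad = 0.5` it gets 1/17, matching the two figures above.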

You might reason that this is a closed interval, so it should stop at 0 and at 1. But suppose we had, for some application, chopped a distribution over [0, 2] into the intervals [0, 1] and (1, 2]. We would want the union of distributions over the latter two intervals to equal the distribution over the former interval. So our distributions ought to mesh nicely.

The third case has similar issues. Perhaps, if we wish to preserve granularity like this, 0 should be selected with probability 1/8, the three points 1/4, 1/2, and 3/4 with probability 1/4 each, and 1 with probability 1/8.
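One way to obtain exactly those probabilities is to split [0, 1] into eight equal cells and round each cell to the nearest multiple of 1/4. The sketch below is only an illustration under that assumption; `nearest_quarter` and `draw_quarter` are hypothetical names, and `draw_quarter` is uniform over the eight cells only when `RAND_MAX + 1` is a multiple of 8.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Map a cell index k in 0..7 (eight equal-width cells of [0, 1]) to the
 * multiple of 1/4 nearest to the cell's centre.
 * Cells 0 and 7 map to 0 and 1; every interior quarter receives two cells.
 */
static double nearest_quarter(unsigned k)
{
    assert(k < 8);
    return (double)((k + 1) / 2) / 4.0;
}

/*
 * Draw one of 0, 1/4, 1/2, 3/4, 1 using rand().
 * Assumption: RAND_MAX + 1 is a multiple of 8, so the eight cells are
 * equally likely. The index stays in 0..7 for any RAND_MAX, but is
 * slightly biased when the assumption does not hold.
 */
static double draw_quarter(void)
{
    unsigned k = (unsigned)(rand() / (RAND_MAX / 8 + 1));
    return nearest_quarter(k);
}
```

Counting the cells per result gives 1, 2, 2, 2, 1 out of 8, that is, 1/8 for each end point and 1/4 for each interior point.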

Aside from these issues of specifying the desired properties of the generators, the code proposed by the questioner has some issues:
- Presuming that RAND_MAX+1 is a power of two (and thus dividing by it is “nice” in binary floating-point arithmetic), dividing by RAND_MAX or RAND_MAX+2 may cause some irregularities in the generated values. There may be odd quantizations in them.
- When 1/(RAND_MAX+1) ≤ 1/4 ULP(1), RAND_MAX/(RAND_MAX+1) will round up and return 1 when it should not because the interval is [0, 1). (“ULP(1)” means the unit of least precision for the value 1 in the floating-point format being used.) (This will not have been observed in tests with long double where RAND_MAX fits within the bits of the significand, but it will occur, for example, where RAND_MAX is 2147483647 and the floating-point type is float, with its 24-bit significand.)
- Multiplying by (b-a) and adding a introduces rounding errors, the consequences of which must be evaluated. There are a number of cases, such as when b-a is small and a is large, when a and b straddle zero (thus causing loss of granularity near b even though finer results are representable), and so on.
- The lower bound of the results for (0, 1) is the floating-point value nearest 1/(RAND_MAX+2). This bound has no relationship to the fineness of the floating-point values or the desired distribution; it is simply an artifact of the implementation of rand. Values in (0, 1/(RAND_MAX+2)) are omitted without any cause stemming from the problem specification. A similar artifact may exist on the upper end (depending on the particular floating-point format, the rand implementation, and the interval endpoint, b).
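The rounding-up problem in the second bullet can be reproduced without depending on the local `rand`. The sketch below assumes the 31-bit case mentioned there (RAND_MAX equal to 2147483647) and takes the draw `r` as a parameter; `naive_unit_float` and `truncated_unit_float` are hypothetical helper names, and discarding the low bits is one possible workaround, not something the question proposed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Naive mapping of a 31-bit draw r into what is meant to be [0, 1).
 * The quotient is exact in double, but the cast to float rounds it to a
 * 24-bit significand, so the largest draws land on 1.0f.
 */
static float naive_unit_float(uint32_t r)
{
    assert(r <= 2147483647u);
    return (float)((double)r / 2147483648.0);
}

/*
 * Possible workaround: keep only the top 24 bits of the draw, so the
 * numerator fits the float significand and the quotient is exact.
 * The largest result is (2^24 - 1) / 2^24, which is strictly below 1.
 */
static float truncated_unit_float(uint32_t r)
{
    assert(r <= 2147483647u);
    return (float)(r >> 7) / 16777216.0f;
}
```

For the largest draw, `naive_unit_float(2147483647u)` compares equal to `1.0f`, while `truncated_unit_float(2147483647u)` stays below it. The price of the workaround is that the low 7 bits of the draw are thrown away.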
I submit the reason the questioner encountered unsatisfying answers for this “simple” problem is that it is not a simple problem.
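As one illustration of the scaling issue listed above, the usual expression a + (b-a)*u can return b even when u is strictly less than 1. This is a minimal sketch with hypothetical names (`largest_below_one`, `scale_naive`, `scale_half_open`), assuming IEEE 754 double arithmetic; signalling the caller to redraw is one possible response, and its effect on the distribution would still have to be evaluated.

```c
#include <assert.h>
#include <float.h>

/* Largest double strictly below 1 (assuming IEEE 754 binary64). */
static double largest_below_one(void)
{
    return 1.0 - DBL_EPSILON / 2.0;
}

/* The usual scaling of u in [0, 1) onto [a, b), with no safeguards. */
static double scale_naive(double a, double b, double u)
{
    return a + (b - a) * u;
}

/*
 * Guarded variant: returns 1 and stores the result in *out when it lies
 * in [a, b); returns 0 when rounding pushed it outside, so the caller
 * can draw a new u.
 */
static int scale_half_open(double a, double b, double u, double *out)
{
    assert(a < b && u >= 0.0 && u < 1.0);
    double x = a + (b - a) * u;
    if (x < a || x >= b)
        return 0;
    *out = x;
    return 1;
}
```

For example, with a = 1.0, b = 1.0 + DBL_EPSILON, and u = `largest_below_one()`, `scale_naive` returns exactly b, and `scale_half_open` reports failure instead of returning a value outside [a, b).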