Non biased return a list of n random positive numbers (>=0) so that their sum == total_sum

天涯浪人 2020-11-30 12:45

I'm looking for either an algorithm or a suggestion to improve my code to generate a list of random numbers whose sum equals some arbitrary number. With my code below

7 answers
  • 2020-11-30 12:50

    Why not just generate the right number of uniformly distributed random numbers, total them up, and scale?

    EDIT: To be a bit clearer: you want N numbers which sum to S? Generate N uniformly distributed random numbers on the interval [0,1), or whatever your RNG produces. Add them up; they will total s (say), whereas you want them to total S, so multiply each number by S/s. Now the numbers are uniformly randomly distributed on [0,S/s), I think.
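
    A minimal sketch of the generate-and-scale idea described above, assuming Python's standard `random` module as the RNG. The function name `scaled_random_numbers` and its parameters are illustrative, not from the answer. Note that the rescaled vector is not uniform over all vectors summing to the total, because the divisor s is itself random.

```python
import random

def scaled_random_numbers(n, total, rng=random):
    """Draw n uniform values on [0, 1) and rescale them so they sum to total."""
    while True:
        raw = [rng.random() for _ in range(n)]
        s = sum(raw)
        if s > 0:  # retry in the (practically impossible) all-zero case
            return [x * total / s for x in raw]

values = scaled_random_numbers(5, 100)
print(values, sum(values))
```

    The sum matches the target only up to floating-point rounding, so compare with a tolerance rather than `==`.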

  • 2020-11-30 12:50

    Here's how I would do it:

    1. Generate n-1 random numbers, all in the range [0,max]
    2. Sort those numbers
    3. For each pair made up of the i-th and (i+1)-th numbers in the sorted list, create an interval (i,i+1) and compute its length. The first interval starts at 0 and ends at the first number in the list; the last interval starts at the last number and ends at max.

    Now, the lengths of those intervals will always sum up to max, since they simply represent segments inside [0,max].

    Code (in Python):

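    #!/usr/bin/env python3
    import random
    
    def random_numbers(n, sum_to):
        # n-1 random integer cut points in [0, sum_to], plus the two endpoints
        values = [0] + [random.randint(0, sum_to) for _ in range(n - 1)] + [sum_to]
        values.sort()
        # lengths of the n consecutive intervals; they always add up to sum_to
        intervals = [values[i + 1] - values[i] for i in range(len(values) - 1)]
        return intervals
    
    if __name__ == '__main__':
        print(random_numbers(5, 100))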
    
  • 2020-11-30 12:56

    I ran into this problem and specifically needed integers. An answer is to use the multinomial.

    import numpy.random, numpy
    total_sum = 20
    n = 6
    
    v = numpy.random.multinomial(total_sum, numpy.ones(n)/n)
    

    As the multinomial documentation explains, this is like rolling a fair six-sided die twenty times. v contains six numbers indicating the number of times each side of the die came up. Naturally the elements of v have to sum to twenty. Here, six is n and twenty is total_sum.

    With the multinomial, you can simulate an unfair die as well, which is very useful in some cases.
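
    To illustrate the unfair-die case, here is a standard-library sketch of what a multinomial draw does: roll a weighted die total_sum times and count the faces. The helper name `weighted_counts` and the weights are hypothetical. With NumPy, the equivalent is to pass non-uniform probabilities as the second argument of `numpy.random.multinomial`.

```python
import random
from collections import Counter

def weighted_counts(total_sum, weights, rng=random):
    """Roll a len(weights)-sided die total_sum times and count each face."""
    faces = range(len(weights))
    rolls = rng.choices(faces, weights=weights, k=total_sum)
    counts = Counter(rolls)
    return [counts[face] for face in faces]

# Hypothetical weights: face 0 is five times as likely as each other face.
v = weighted_counts(20, [5, 1, 1, 1, 1, 1])
print(v, sum(v))
```

    The counts are non-negative integers and always add up to total_sum, whatever the weights.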

  • 2020-11-30 13:00

    The following is quite simple, and returns uniform results:

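    import random

    def gen_list(numbs, limit_sum):
        # numbs-1 sorted random cut points inside [0, limit_sum]
        limits = sorted(random.uniform(0, limit_sum) for _ in range(numbs - 1))
        limits = [0] + limits + [limit_sum]
        # consecutive differences are the numbs interval lengths
        return [x1 - x0 for (x0, x1) in zip(limits[:-1], limits[1:])]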
    

    The idea is simply that if you need, say, 5 numbers between 0 and 20, you can simply put 4 "limits" between 0 and 20, and you get a partition of the (0, 20) interval. The random numbers that you want are simply the lengths of the 5 intervals in the sorted list [0, random1, random2, random3, random4, 20].

    PS: oops! looks like it's the same idea as MAK's response, albeit coded without using indexes!

  • 2020-11-30 13:01

    You could keep a running total rather than having to call sum(my_sum) repeatedly.

  • 2020-11-30 13:02

    All right, we're going to tackle the problem assuming the requirement is to generate a random vector of length N that is uniformly distributed over the allowed space, restated as follows:

    Given

    • a desired length N,
    • a desired total sum S,
    • a range of allowed values [0,B] for each scalar value,

    generate a random vector V of length N such that the random variable V is uniformly distributed throughout its permitted space.


    We can simplify the problem by noting that we can calculate V = U * S where U is a similar random vector with desired total sum 1, and a range of allowed values [0,b] where b = B/S. The value b must be between 1/N and 1.


    First consider N = 3. The space of allowed values {U} is a portion of a plane perpendicular to the vector [1 1 1] that passes through the point [1/3 1/3 1/3] and which lies inside the cube whose components range between 0 and b. This set of points {U} is shaped like a hexagon.

    (TBD: picture. I can't generate one right now, I need access to MATLAB or another program that can do 3D plots. My installation of Octave can't.)

    It is best to use an orthonormal weighting matrix W (see my other answer) with one vector = [1 1 1]/sqrt(3). One such matrix is

    octave-3.2.3:1> A=1/sqrt(3)
       A =  0.57735
    octave-3.2.3:2> K=1/sqrt(3)/(sqrt(3)-1)
       K =  0.78868
    octave-3.2.3:3> W = [A A A; A 1-K -K; A -K 1-K]
       W =
    
         0.57735   0.57735   0.57735
         0.57735   0.21132  -0.78868
         0.57735  -0.78868   0.21132
    

    which, again, is orthonormal (W*W' = I; this particular W is also symmetric, so W*W = I as well)
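
    The orthonormality claim can be checked numerically. This is a small pure-Python sketch that rebuilds W from the A and K values in the Octave session above; the helpers `matmul` and `transpose` are illustrative names, not part of the answer.

```python
import math

A = 1 / math.sqrt(3)
K = 1 / math.sqrt(3) / (math.sqrt(3) - 1)
W = [[A, A, A],
     [A, 1 - K, -K],
     [A, -K, 1 - K]]

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    # Plain triple-loop matrix product for small lists of lists.
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

product = matmul(W, transpose(W))
for row in product:
    print([round(x, 6) for x in row])
```

    The printed matrix should be the identity up to rounding.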

    If you consider the points of the cube [0 0 b],[0 b b],[0 b 0],[b b 0],[b 0 0], and [b 0 b], these form a hexagon and are all a distance of b*sqrt(2/3) from the cube's diagonal. They do not satisfy the problem in question, but will be useful in a moment. The other two points [0 0 0] and [b b b] are on the cube's diagonal.

    The orthonormal weighting matrix W allows us to generate points that are uniformly distributed within {U}, because orthonormal matrices are coordinate transformations that rotate/reflect and do not scale or skew.

    We will generate points that are uniformly distributed in the coordinate system defined by the 3 vectors of W. The first component is the axis of the diagonal of the cube. The sum of U's components depends completely upon this axis and not at all on the others. Therefore the coordinate along this axis is forced to be 1/sqrt(3) which corresponds to the point [1/3, 1/3, 1/3].

    The other two components are in directions perpendicular to the cube's diagonal. Since the maximum distance from the diagonal is b*sqrt(2/3), we will generate uniformly distributed numbers (u,v) between -b*sqrt(2/3) and +b*sqrt(2/3).

    This gives us a random variable U' = [1/sqrt(3) u v]. We then compute U = U' * W. Some of the resulting points will be outside the allowable range (each component of U must be between 0 and b), in which case we reject that and start over.

    In other words:

    1. Generate independent random variables u and v that are each uniformly distributed between -b*sqrt(2/3) and +b*sqrt(2/3).
    2. Calculate the vector U' = [1/sqrt(3) u v]
    3. Compute U = U' * W.
    4. If any of U's components are outside the range [0,b], reject this value and go back to step 1.
    5. Calculate V = U * S.
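
    The five steps above can be sketched in Python for N = 3. This is an illustrative sketch, not the answer's own code: the function name `random_vector_3` is hypothetical, and the check on b is made strict at the lower end, since at b = 1/3 the allowed set is a single point and rejection sampling would never accept.

```python
import math
import random

A = 1 / math.sqrt(3)
K = 1 / math.sqrt(3) / (math.sqrt(3) - 1)
W = [[A, A, A],
     [A, 1 - K, -K],
     [A, -K, 1 - K]]

def random_vector_3(S, B, rng=random):
    """Return 3 numbers in [0, B] that sum to S, via rejection sampling."""
    b = B / S
    if not (1 / 3 < b <= 1):
        raise ValueError("need S/3 < B <= S")
    r = b * math.sqrt(2 / 3)  # maximum distance from the cube's diagonal
    while True:
        # Steps 1-2: fixed coordinate along the diagonal, uniform u and v.
        u_prime = [1 / math.sqrt(3), rng.uniform(-r, r), rng.uniform(-r, r)]
        # Step 3: row vector times matrix, U = U' * W.
        U = [sum(u_prime[i] * W[i][j] for i in range(3)) for j in range(3)]
        # Step 4: reject points outside the cube [0, b]^3.
        if all(0 <= x <= b for x in U):
            # Step 5: scale back to the requested total.
            return [x * S for x in U]

V = random_vector_3(10, 6)
print(V, sum(V))
```

    The acceptance rate drops as b approaches 1/3, so the loop can take many iterations when B is only slightly larger than S/3.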

    The solution is similar for higher dimensions (uniformly distributed points within a portion of the hyperplane perpendicular to a hypercube's main diagonal):

    Precalculate a weighting matrix W of rank N.

    1. Generate independent random variables u1, u2, ... uN-1 each uniformly distributed between -b*k(N) and +b*k(N).
    2. Calculate the vector U' = [1/sqrt(N) u1, u2, ... uN-1]
    3. Compute U = U' * W. (there are shortcuts to actually having to construct and multiply by W.)
    4. If any of U's components are outside the range [0,b], reject this value and go back to step 1.
    5. Calculate V = U * S.

    The range k(N) is a function of N that represents the maximum distance of the vertices of a hypercube of side 1 from its main diagonal. I'm not sure of the general formula, but it's sqrt(2/3) for N = 3 and sqrt(6/5) for N = 5; there's probably a formula for it somewhere.
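
    One candidate for the general formula, offered as a derivation that is not in the original answer: a vertex with m coordinates equal to 1 has squared length m and component m/sqrt(N) along the unit diagonal, so its squared distance from the diagonal is m*(N-m)/N, which is largest at m = floor(N/2). The sketch below compares that closed form against a brute-force search over all vertices; both function names are illustrative.

```python
import itertools
import math

def k_bruteforce(N):
    """Largest distance from any vertex of the unit N-cube to its main diagonal."""
    best = 0.0
    for vertex in itertools.product((0, 1), repeat=N):
        m = sum(vertex)                # number of coordinates equal to 1
        along = m / math.sqrt(N)       # component along the unit diagonal
        dist_sq = m - along ** 2       # squared norm of a 0/1 vertex is m
        best = max(best, math.sqrt(max(dist_sq, 0.0)))
    return best

def k_closed_form(N):
    m = N // 2
    return math.sqrt(m * (N - m) / N)

for N in range(2, 8):
    print(N, k_bruteforce(N), k_closed_form(N))
```

    The two columns agree for the values tested, and the closed form reproduces sqrt(2/3) for N = 3 and sqrt(6/5) for N = 5.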
