Algorithm for generating random numbers from a discrete distribution?

我只是一个虾纸丫 提交于 2019-12-21 20:47:05

问题


Design a fast algorithm to repeatedly generate numbers from the discrete distribution: Given an array a[] of non negative real numbers that sum to 1, the goal is to return index i with probability a[i]

I found this question in a an online algorithm book, Introduction to Programming in Java, chapter 4.2: Sorting and Searching (http://introcs.cs.princeton.edu/java/42sort/) .

the hint says:

Form an array s[] of cumulated sums such that s[i] is the sum of the first i elements of a[]. Now, generate a random real number r between 0 and 1, and use binary search to return the index i for which s[i] ≤ s[i+1].

some how I am not able to understand the hint and hence cant find the solution..


回答1:


There are many ways to answer this problem. This article describes numerous approaches, their strengths, their weaknesses, and their runtimes. It concludes with an algorithm that takes O(n) preprocessing time, then generates numbers in time O(1) each.

The particular approach you're looking for is described under "roulette wheel selection."

Hope this helps!




回答2:


Here's a Python algorithm which implements the 'Roulette Wheel' Technique. It's tough to explain without a graphic. The article linked to by templatetypedef should do well for that. Also, note that this algorithm doesn't actually require the weights to be normalized (they don't need to sum to 1), but this will work nonetheless.

import random

trials = 50
selected_indices = []

# weights on each index
distrib = [0.1, 0.4, 0.2, 0.3]

index = random.randrange(0, len(distrib) - 1)
max_weight = max(distrib)
B = 0
# generate 'trials' random indices
for i in range (trials):

    # increase B by a factor which is
    # guaranteed to be much larger than our largest weight
    B = B + random.uniform(0, 2 * max_weight)

    # continue stepping through wheel until B lands 'within' a weight
    while(B > distrib[index]):
        B = B - distrib[index]
        index = (index + 1) % len(distrib)
    selected_indices.append(index)

print("Randomly selected indices from {0} trials".format(trials))
print(selected_indices)



回答3:


This is a snippet from wakkerbot/megahal. Here the weights are (unsigned) ints, and their sum is in node->childsum. For maximum speed, the children are sorted (more or less) in descending order. (the weights are expected to have a power-law like distribution, with only a few high weights and many smaller ones)

    /*
     *          Choose a symbol at random from this context.
     *          weighted by ->thevalue
     */
    credit = urnd( node->childsum );
    for(cidx=0; 1; cidx = (cidx+1) % node->branch) {
        symbol = node->children[cidx].ptr->symbol;
        if (credit < node->children[cidx].ptr->thevalue) break;
        /* 20120203 if (node->children[cidx].ptr->thevalue == 0) credit--; */
        credit -= node->children[cidx].ptr->thevalue;
    }
done:
    // fprintf(stderr, "{+%u}", symbol );
    return symbol;



回答4:


Depending on the granularity, you can create an Index with 100, 1000 or 10000 Elements. Assume a distribution (a,b,c,d) with p=(10%, 20%, 30%, 40%), we create a Map:

val prob = Map ('a' -> 10, 'b' -> 20, 'c' -> 30, 'd' -> 40) 
val index = (for (e <- prob;
  i <- (1 to e._2)) yield e._1 ).toList 

index: List[Char] = List(a, a, a, a, a, a, a, a, a, a, 
b, b, b, b, b, b, b, b, b, b, b, b, b, b, b, b, b, b, b, b, 
c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, 
d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d, d)

We can now pick an element of desired probability very fast:

val x = index (random.nextInt (100))

x is now by 40% d, by 10% a and so on. Short setup, fast access.

The numbers don't even need to sum up to 100, but you have to calculate the range once, then:

val max = prob.map (_._2).sum 
val x = index (random.nextInt (max))


来源:https://stackoverflow.com/questions/10064344/algorithm-for-generating-random-numbers-from-a-discrete-distribution

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!