Given a list of tuples where each tuple consists of a probability and an item I\'d like to sample an item according to its probability. For example, give the list [ (.3, \'a\'),
Just inspired of sholte's very straightforward (and correct) answer: I'll just demonstrate how easy it will be to extend it to handle arbitrary items, like:
In []: s= array([.3, .4, .3]).cumsum().searchsorted(sample(54))
In []: c, _= histogram(s, bins= arange(4))
In []: [item* c[i] for i, item in enumerate('abc')]
Out[]: ['aaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbbbbbb', 'cccccccccccccccc']
Update:
Based on the feedback of phant0m, it turns out that an even more straightforward solution can be implemented based on multinomial, like:
In []: s= multinomial(54, [.3, .4, .3])
In []: [item* s[i] for i, item in enumerate('abc')]
Out[]: ['aaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbbbbbbb', 'cccccccccccc']
IMHO here we have a nice summary of empirical cdf and multinomial based sampling yielding similar results. So, in a summary, pick it up one which suits best for your purposes.