I'm looking for a reasonable definition of a function weighted_sample that does not return just one random index for a list of given weights (which would be so
random.sample is pretty fast, so unless you have many megabytes of data to deal with, sample() should be fine.
On my machine it took 1.655 seconds to produce 1000 samples of length 100 from 10000000 elements, and 12.98 seconds to produce 100000 samples of length 100 from the same 10000000 elements.
from random import sample, random
from time import time

def generate(n1, n2, n3):
    # Build a population of n1 random floats, then draw n2 samples of n3 elements each.
    w = [random() for _ in range(n1)]
    print(len(w))
    samples = []
    for _ in range(n2):
        samples.append(sample(w, n3))
    return samples

start = time()
size_set = 10**7
num_samples = 10**5
length_sample = 100
samples = generate(size_set, num_samples, length_sample)
end = time()  # the timing covers building the population and drawing the samples

# Traverse every sample once and add up all elements.
allsum = 0
for row in samples:
    allsum += sum(row)
print('sum of all elements', allsum)
print('%f seconds for %i samples of length %i from %i elements'
      % (end - start, num_samples, length_sample, size_set))
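
Note that sample() draws uniformly, so the benchmark above does not take any weights into account. For the weighted_sample the question asks about, here is a minimal sketch. It assumes that the indices should be drawn without replacement (the question as quoted does not say) and it uses the Efraimidis-Spirakis key trick: give every index the key u ** (1 / w), with u uniform in [0, 1), and keep the k indices with the largest keys. The signature, the optional rng parameter and the handling of zero weights are my own choices, not something from the question.

```python
import heapq
import random

def weighted_sample(weights, k, rng=random):
    """Return k distinct indices, drawn without replacement, favouring larger weights."""
    if k < 0:
        raise ValueError("k must be non-negative")
    # Assumption: indices with a weight of zero (or less) can never be drawn.
    candidates = [(i, w) for i, w in enumerate(weights) if w > 0]
    if k > len(candidates):
        raise ValueError("k exceeds the number of positive weights")
    # Efraimidis-Spirakis keys: u ** (1 / w), with u uniform in [0, 1).
    keyed = ((rng.random() ** (1.0 / w), i) for i, w in candidates)
    # The k indices with the largest keys form the weighted sample.
    return [i for _, i in heapq.nlargest(k, keyed)]
```

Calling weighted_sample([0.1, 0.0, 5.0, 2.0, 0.5], 3) returns three distinct indices and never index 1, since its weight is zero. If drawing with replacement is acceptable, random.choices(range(len(weights)), weights=weights, k=k) from the standard library (Python 3.6+) is probably simpler.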