Python has my_sample = random.sample(range(100), 10) to randomly sample without replacement from [0, 100).
Suppose I have sampled n>
This is a rewritten version of @necromancer's cool solution. It wraps the algorithm in a class to make it much easier to use correctly, and uses more dict methods to cut the lines of code.
from random import randrange

class Sampler:
    def __init__(self, n):
        self.n = n # number remaining from original range(n)
        # i is a key iff i < n and i already returned;
        # in that case, state[i] is a value to return
        # instead of i.
        self.state = dict()

    def get(self):
        n = self.n
        if n <= 0:
            raise ValueError("range exhausted")
        result = i = randrange(n)
        state = self.state
        # Most of the fiddling here is just to get
        # rid of state[n-1] (if it exists). It's a
        # space optimization.
        if i == n - 1:
            if i in state:
                result = state.pop(i)
        elif i in state:
            result = state[i]
            if n - 1 in state:
                state[i] = state.pop(n - 1)
            else:
                state[i] = n - 1
        elif n - 1 in state:
            state[i] = state.pop(n - 1)
        else:
            state[i] = n - 1
        self.n = n-1
        return result
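For comparison, the branching in get() can be folded into two dict calls with defaults. The following is only a sketch: the name CompactSampler is hypothetical and not part of the original answer, and it assumes the same bookkeeping as above (state[i] holds the value to return instead of i). It should behave the same way, consuming one randrange(n) call per draw.

```python
from random import randrange


class CompactSampler:
    """Hypothetical compact variant of the Sampler class (illustration only)."""

    def __init__(self, n):
        self.n = n        # number remaining from the original range(n)
        self.state = {}   # slot index -> value to return instead of the index

    def get(self):
        n = self.n
        if n <= 0:
            raise ValueError("range exhausted")
        i = randrange(n)
        state = self.state
        # Value held by the last live slot. Popping it (with n - 1 as the
        # default) is the space optimization: slot n - 1 is never needed again.
        last = state.pop(n - 1, n - 1)
        if i == n - 1:
            result = last
        else:
            # Return whatever slot i holds, then move the last value into slot i.
            result = state.get(i, i)
            state[i] = last
        self.n = n - 1
        return result
```

Both dict.pop(key, default) and dict.get(key, default) return the default when the key is missing, which is what removes the four-way if/elif chain. Whether the shorter form is easier to follow is a matter of taste.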
Here's a basic driver:
s = Sampler(100)
allx = [s.get() for _ in range(100)]
assert sorted(allx) == list(range(100))

from collections import Counter
c = Counter()
for i in range(6000):
    s = Sampler(3)
    one = tuple(s.get() for _ in range(3))
    c[one] += 1
for k, v in sorted(c.items()):
    print(k, v)
and sample output:
(0, 1, 2) 1001
(0, 2, 1) 991
(1, 0, 2) 995
(1, 2, 0) 1044
(2, 0, 1) 950
(2, 1, 0) 1019
By eyeball, that distribution is fine (run a chi-squared test if you're skeptical). Some of the solutions here don't give each permutation with equal probability (even though they return each k-subset of n with equal probability), so they differ from random.sample() in that respect.
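For the skeptical, here is a minimal sketch of that chi-squared check, done by hand on the counts from the sample output above so that no third-party library is needed. The 5% significance level and the variable names are my own choices; 11.070 is the standard table value for 5 degrees of freedom at that level.

```python
# Counts copied from the sample output above (6000 trials, 6 permutations).
observed = [1001, 991, 995, 1044, 950, 1019]
expected = sum(observed) / len(observed)  # 1000.0 if all permutations are equally likely

# Pearson's chi-squared statistic: sum of (observed - expected)^2 / expected.
chi2 = sum((o - expected) ** 2 / expected for o in observed)

# Critical value for 6 - 1 = 5 degrees of freedom at the 5% level.
CRITICAL_5DF_05 = 11.070

print(f"chi-squared = {chi2:.3f}")  # chi-squared = 4.904
print("consistent with uniform" if chi2 < CRITICAL_5DF_05 else "suspicious")
```

Since 4.904 is well below 11.070, this run gives no reason to doubt that the six permutations are equally likely. A different run will of course produce different counts.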