Python: Given a set of N elements, choose k at random, m times

给你一囗甜甜゛ 提交于 2019-12-23 13:07:53

问题


Given a set of N elements, I want to choose m random, non-repeating subsets of k elements.

If I was looking to generate all the N choose k combinations, I could have used itertools.combination, so one way to do what I m asking would be:

import numpy as np
import itertools
n=10
A = np.arange(n)
k=4
m=5
result = np.random.permutation([x for x in itertools.permutations(A,k)])[:m]
print(result)

The problem is of course that this code first generates all the possible permutations, and that this can be quite expensive.

Another suboptimal solution would be to choose each time a single permutation at random (e.g. choose-at-random-from-combinations, then sort to get permutation), and discard it if it has already been selected.

Is there a better way to do this?


回答1:


Your second solution seems to be the only practical way to do it. It will work well unless k is close to n and m is "large", in which case there will be more repetitions.

I added a count of the tries needed to get the samples we need. For m=50, with n=10 and k=4, it takes usually less than 60 tries. You can see how it goes with the size of your population and your samples.

You can use random.sample to get a list of k values without replacement, then sort it and turn it into a tuple. So, we can use a set for keeping only unique results.

import random

n = 10
A = list(range(n))
k = 4
m = 5

samples = set()
tries = 0
while len(samples) < m:
    samples.add(tuple(sorted(random.sample(A, k))))
    tries += 1

print(samples)
print(tries)

# {(1, 4, 5, 9), (0, 3, 6, 8), (0, 4, 7, 8), (3, 5, 7, 9), (1, 2, 3, 4)}
# 6
# 6 tries this time !



回答2:


The simplest way to do it is to random.shuffle(range) then take first k elements (need to be repeated until m valid samples are collected).

Of course this procedure cannot guarantee unique samples. You are to check a new sample against your historical hash if you really need it.

Since Pyton2.3, random.sample(range, k) can be used to produce a sample in a more efficient way



来源:https://stackoverflow.com/questions/48580361/python-given-a-set-of-n-elements-choose-k-at-random-m-times

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!