Making combinations in python

被刻印的时光 ゝ 提交于 2019-12-08 10:04:31
Gareth Rees

NPE's answer solves the problem by constructing the full list of subsets in memory and then sorting it. This requires O(2n) space and O(n2 2n) time. If this is unacceptable, then here's a way to generate the subsets in O(n) space and O(n 2n) time.

from itertools import combinations

def subsets(s, placeholder = None):
    """
    Generate the subsets of `s` in order of size.
    Use `placeholder` for missing elements (default: None).
    """
    s = list(s)
    n = len(s)
    r = range(n)
    for i in range(n + 1):
        for c in combinations(r, i):
            result = [placeholder] * n
            for j in c:
                result[j] = s[j]
            yield result

>>> from pprint import pprint
>>> pprint(list(subsets([23, 'M', 'Delhi', 'Muslim'])))
[[None, None, None, None],
 [23, None, None, None],
 [None, 'M', None, None],
 [None, None, 'Delhi', None],
 [None, None, None, 'Muslim'],
 [23, 'M', None, None],
 [23, None, 'Delhi', None],
 [23, None, None, 'Muslim'],
 [None, 'M', 'Delhi', None],
 [None, 'M', None, 'Muslim'],
 [None, None, 'Delhi', 'Muslim'],
 [23, 'M', 'Delhi', None],
 [23, 'M', None, 'Muslim'],
 [23, None, 'Delhi', 'Muslim'],
 [None, 'M', 'Delhi', 'Muslim'],
 [23, 'M', 'Delhi', 'Muslim']]

How about the following:

In [21]: attrib = (23, "M", "Delhi", "Muslim")

In [25]: comb = list(itertools.product(*((a, None) for a in attrib)))

In [26]: comb
Out[26]: 
[(23, 'M', 'Delhi', 'Muslim'),
 (23, 'M', 'Delhi', None),
 (23, 'M', None, 'Muslim'),
 (23, 'M', None, None),
 (23, None, 'Delhi', 'Muslim'),
 (23, None, 'Delhi', None),
 (23, None, None, 'Muslim'),
 (23, None, None, None),
 (None, 'M', 'Delhi', 'Muslim'),
 (None, 'M', 'Delhi', None),
 (None, 'M', None, 'Muslim'),
 (None, 'M', None, None),
 (None, None, 'Delhi', 'Muslim'),
 (None, None, 'Delhi', None),
 (None, None, None, 'Muslim'),
 (None, None, None, None)]

Now, if I understood your sorting requirements correctly, the following should do it:

In [27]: sorted(comb, key=lambda x:sum(v is not None for v in x))
Out[27]: 
[(None, None, None, None),
 (23, None, None, None),
 (None, 'M', None, None),
 (None, None, 'Delhi', None),
 (None, None, None, 'Muslim'),
 (23, 'M', None, None),
 (23, None, 'Delhi', None),
 (23, None, None, 'Muslim'),
 (None, 'M', 'Delhi', None),
 (None, 'M', None, 'Muslim'),
 (None, None, 'Delhi', 'Muslim'),
 (23, 'M', 'Delhi', None),
 (23, 'M', None, 'Muslim'),
 (23, None, 'Delhi', 'Muslim'),
 (None, 'M', 'Delhi', 'Muslim'),
 (23, 'M', 'Delhi', 'Muslim')]

I've used None where you used *, but it's trivial to use the latter.

Of course with 30 attributes you're looking at ~1 billion combinations, so the flattening of the list with subsequent sorting probably won't work. However, what can you usefully do with 1 billion entries anyway?

Look at itertools, it has a method for combinations

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!