All combinations of values from two lists representing a certain feature

Submitted by 自作多情 on 2020-02-14 02:27:07

Question


I have three lists:

a = [0,1,2]
b = [3,4,5]
c = ['aab', 'abb', 'aaa']

How can I create all three-element combinations, where each string in the list c tells you which list (a or b) supplies the number for each position of an output sequence?

For example, for i = 0 (the template 'aab') the output should be (pseudocode):

    [0,1,3]
    [0,1,4]
      ...
    [0,2,5]
      ...
    [1,2,4]
    [1,2,5]

And the same for the remaining indexes i. Values within an individual sublist must not be repeated. I would be very grateful for any tips.


Answer 1:


This generator function handles 'ab' template strings with the a's and b's in any order, and the output lists will not contain repeated items as long as the a and b lists are disjoint. We use itertools.combinations to generate combinations of the required size from each list, and pair up the a and b combinations using itertools.product. To place the values in the order the template asks for, we turn each a and b combination into an iterator and pull the next value from the right iterator via a dictionary.

from itertools import combinations, product

def groups(a, b, c):
    for pat in c:
        acombo = combinations(a, pat.count('a'))
        bcombo = combinations(b, pat.count('b'))
        for ta, tb in product(acombo, bcombo):
            d = {'a': iter(ta), 'b': iter(tb)}
            yield [next(d[k]) for k in pat]

# tests

a = [0,1,2]
b = [3,4,5]

templates = ['aab', 'abb', 'aaa'], ['aba'], ['bab']

for c in templates:
    print('c', c)
    for i, t in enumerate(groups(a, b, c), 1):
        print(i, t)
    print()

output

c ['aab', 'abb', 'aaa']
1 [0, 1, 3]
2 [0, 1, 4]
3 [0, 1, 5]
4 [0, 2, 3]
5 [0, 2, 4]
6 [0, 2, 5]
7 [1, 2, 3]
8 [1, 2, 4]
9 [1, 2, 5]
10 [0, 3, 4]
11 [0, 3, 5]
12 [0, 4, 5]
13 [1, 3, 4]
14 [1, 3, 5]
15 [1, 4, 5]
16 [2, 3, 4]
17 [2, 3, 5]
18 [2, 4, 5]
19 [0, 1, 2]

c ['aba']
1 [0, 3, 1]
2 [0, 4, 1]
3 [0, 5, 1]
4 [0, 3, 2]
5 [0, 4, 2]
6 [0, 5, 2]
7 [1, 3, 2]
8 [1, 4, 2]
9 [1, 5, 2]

c ['bab']
1 [3, 0, 4]
2 [3, 0, 5]
3 [4, 0, 5]
4 [3, 1, 4]
5 [3, 1, 5]
6 [4, 1, 5]
7 [3, 2, 4]
8 [3, 2, 5]
9 [4, 2, 5]
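
To make the iterator trick concrete, here is a minimal sketch of a single step of the function above, for one template and one pair of combinations. The helper name interleave is made up for illustration and is not part of the original answer.

```python
def interleave(pattern, picks):
    """Fill each position of pattern from the tuple stored under that letter.

    Hypothetical helper: picks maps a letter to a tuple holding exactly as
    many values as that letter occurs in pattern.
    """
    # One fresh iterator per letter: every next() call hands out the next
    # unused value for that letter, so values keep their relative order.
    iters = {letter: iter(values) for letter, values in picks.items()}
    return [next(iters[letter]) for letter in pattern]


print(interleave('aba', {'a': (0, 2), 'b': (4,)}))   # [0, 4, 2]
print(interleave('bab', {'a': (1,), 'b': (3, 5)}))   # [3, 1, 5]
```

Because each letter has its own iterator, the position of the a's and b's in the template does not matter; only their counts have to match the tuple lengths.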

I should mention that even though combinations returns an iterator, and product happily accepts iterators as arguments, product has to store the full contents of each argument internally (as tuples) because it needs to run over them multiple times. So if the number of combinations is huge, this can consume a fair amount of RAM.
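
If that memory cost matters, one possible workaround is to replace product with nested loops that regenerate the inner combinations on every pass. The sketch below is an illustration under that assumption, not part of the original answer; the name groups_lowmem is hypothetical, and it assumes a and b are sequences (such as lists) that can be iterated repeatedly.

```python
from itertools import combinations


def groups_lowmem(a, b, c):
    """Variant of groups that never stores a full set of combinations.

    Hypothetical alternative: the b-combinations are regenerated for every
    a-combination, trading repeated work for lower memory use.
    """
    for pat in c:
        na, nb = pat.count('a'), pat.count('b')
        for ta in combinations(a, na):
            # A new combinations object per outer step; nothing is cached.
            for tb in combinations(b, nb):
                d = {'a': iter(ta), 'b': iter(tb)}
                yield [next(d[k]) for k in pat]


print(list(groups_lowmem([0, 1, 2], [3, 4, 5], ['aba']))[:3])
# [[0, 3, 1], [0, 4, 1], [0, 5, 1]]
```

The results come out in the same order as with product, since the last argument of product also varies fastest.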


If you want permutations instead of combinations, that's easy. We just call itertools.permutations instead of itertools.combinations.

from itertools import permutations, product

def groups(a, b, c):
    for pat in c:
        acombo = permutations(a, pat.count('a'))
        bcombo = permutations(b, pat.count('b'))
        for ta, tb in product(acombo, bcombo):
            d = {'a': iter(ta), 'b': iter(tb)}
            yield [next(d[k]) for k in pat]

# tests

a = [0,1,2]
b = [3,4,5]

templates = ['aaa'], ['abb'] 

for c in templates:
    print('c', c)
    for i, t in enumerate(groups(a, b, c), 1):
        print(i, t)
    print()

output

c ['aaa']
1 [0, 1, 2]
2 [0, 2, 1]
3 [1, 0, 2]
4 [1, 2, 0]
5 [2, 0, 1]
6 [2, 1, 0]

c ['abb']
1 [0, 3, 4]
2 [0, 3, 5]
3 [0, 4, 3]
4 [0, 4, 5]
5 [0, 5, 3]
6 [0, 5, 4]
7 [1, 3, 4]
8 [1, 3, 5]
9 [1, 4, 3]
10 [1, 4, 5]
11 [1, 5, 3]
12 [1, 5, 4]
13 [2, 3, 4]
14 [2, 3, 5]
15 [2, 4, 3]
16 [2, 4, 5]
17 [2, 5, 3]
18 [2, 5, 4]

Finally, here's a version that handles any number of lists, and template strings of any length. It only accepts a single template string per call, but that shouldn't be an issue. You can also choose whether you want to generate permutations or combinations via an optional keyword arg.

from itertools import permutations, combinations, product

def groups(sources, template, mode='P'):
    func = permutations if mode == 'P' else combinations
    keys = sources.keys()
    combos = [func(sources[k], template.count(k)) for k in keys]
    for t in product(*combos):
        d = {k: iter(v) for k, v in zip(keys, t)}
        yield [next(d[k]) for k in template]

# tests

sources = {
    'a': [0, 1, 2],
    'b': [3, 4, 5],
    'c': [6, 7, 8],
}

templates = 'aa', 'abc', 'abba', 'cab'

for template in templates:
    print('\ntemplate', template)
    for i, t in enumerate(groups(sources, template, mode='C'), 1):
        print(i, t)

output

template aa
1 [0, 1]
2 [0, 2]
3 [1, 2]

template abc
1 [0, 3, 6]
2 [0, 3, 7]
3 [0, 3, 8]
4 [0, 4, 6]
5 [0, 4, 7]
6 [0, 4, 8]
7 [0, 5, 6]
8 [0, 5, 7]
9 [0, 5, 8]
10 [1, 3, 6]
11 [1, 3, 7]
12 [1, 3, 8]
13 [1, 4, 6]
14 [1, 4, 7]
15 [1, 4, 8]
16 [1, 5, 6]
17 [1, 5, 7]
18 [1, 5, 8]
19 [2, 3, 6]
20 [2, 3, 7]
21 [2, 3, 8]
22 [2, 4, 6]
23 [2, 4, 7]
24 [2, 4, 8]
25 [2, 5, 6]
26 [2, 5, 7]
27 [2, 5, 8]

template abba
1 [0, 3, 4, 1]
2 [0, 3, 5, 1]
3 [0, 4, 5, 1]
4 [0, 3, 4, 2]
5 [0, 3, 5, 2]
6 [0, 4, 5, 2]
7 [1, 3, 4, 2]
8 [1, 3, 5, 2]
9 [1, 4, 5, 2]

template cab
1 [6, 0, 3]
2 [7, 0, 3]
3 [8, 0, 3]
4 [6, 0, 4]
5 [7, 0, 4]
6 [8, 0, 4]
7 [6, 0, 5]
8 [7, 0, 5]
9 [8, 0, 5]
10 [6, 1, 3]
11 [7, 1, 3]
12 [8, 1, 3]
13 [6, 1, 4]
14 [7, 1, 4]
15 [8, 1, 4]
16 [6, 1, 5]
17 [7, 1, 5]
18 [8, 1, 5]
19 [6, 2, 3]
20 [7, 2, 3]
21 [8, 2, 3]
22 [6, 2, 4]
23 [7, 2, 4]
24 [8, 2, 4]
25 [6, 2, 5]
26 [7, 2, 5]
27 [8, 2, 5]
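
As a sanity check on the listings above, the number of results can be predicted without generating them: it is the product, over the letters in the template, of the number of combinations (or permutations) of the required size. A minimal sketch, assuming Python 3.8+ for math.comb, math.perm and math.prod; the function name expected_count is made up for illustration.

```python
from collections import Counter
from math import comb, perm, prod


def expected_count(sources, template, mode='P'):
    """Number of lists that groups(sources, template, mode) should yield."""
    count = perm if mode == 'P' else comb
    # Multiply the per-letter counts; comb and perm return 0 when a letter
    # is requested more often than its source list has items.
    return prod(count(len(sources[k]), r) for k, r in Counter(template).items())


sources = {'a': [0, 1, 2], 'b': [3, 4, 5], 'c': [6, 7, 8]}
for template in ('aa', 'abc', 'abba', 'cab'):
    print(template, expected_count(sources, template, mode='C'))
# aa 3
# abc 27
# abba 9
# cab 27
```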



Answer 2:


from itertools import product, chain

setups = ['aab', 'abb', 'aaa']
sources = {
    'a': [0,1,2],
    'b': [3,4,5]
}

combinations = (product(*map(sources.get, setup)) for setup in setups) 

combinations is a nested lazy iterator (i.e. nothing has been computed or stored in memory yet). If you want an iterator of lists instead:

combinations = map(list, (product(*map(sources.get, setup)) for setup in setups))

Or you might want to flatten the result:

combinations = chain.from_iterable(product(*map(sources.get, setup)) for setup in setups)
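
A short sketch of what "lazy" means in practice for the flattened version: tuples are only computed when requested, and the iterator can be consumed just once. The variable names flat, first and rest are chosen here for illustration.

```python
from itertools import chain, product

setups = ['aab', 'abb', 'aaa']
sources = {'a': [0, 1, 2], 'b': [3, 4, 5]}

flat = chain.from_iterable(product(*map(sources.get, setup)) for setup in setups)

first = next(flat)   # only now is the first tuple computed
rest = list(flat)    # consumes everything that is left
print(first)         # (0, 0, 3)
print(len(rest))     # 80
print(list(flat))    # [] because the iterator is now exhausted
```

Note that plain product fills every position independently, so tuples with repeated values such as (0, 0, 3) are included. If values must not repeat, they have to be filtered out afterwards.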



Answer 3:


If I understand correctly, you can achieve this with a dictionary that maps each character, such as "a", to the corresponding list a.

from collections import defaultdict

a = [0,1,2]
b = [3,4,5]
c = ["aab", "abb", "aaa"]
d = {"a": a, "b": b}
d2 = defaultdict(list)
for seq in c:
    l = []
    for idx, v in enumerate(seq):
        l.append(d[v][idx]) 
    print(l)
    d2[seq].append(l)
# Out:
#[0, 1, 5]
#[0, 4, 5]
#[0, 1, 2]
print(d2)
# defaultdict(<class 'list'>, {'aab': [[0, 1, 5]], 'abb': [[0, 4, 5]], 'aaa': [[0, 1, 2]]})



Answer 4:


Put the lists in a dictionary so you can access them with strings.
Use the characters in each sequence to determine which lists to use.
Use itertools.product to get the combinations.

import itertools, collections
from pprint import pprint
d = {'a':[0,1,2], 'b':[3,4,5]}
c = ['aab', 'abb', 'aaa']

def f(t):
    t = collections.Counter(t)
    return max(t.values()) < 2

for seq in c:
    data = (d[char] for char in seq)
    print(f'sequence: {seq}')
    pprint(list(filter(f, itertools.product(*data))))
    print('***************************')

Result for sequence 'abb':

sequence: abb
[(0, 3, 4),
 (0, 3, 5),
 (0, 4, 3),
 (0, 4, 5),
 (0, 5, 3),
 (0, 5, 4),
 (1, 3, 4),
 (1, 3, 5),
 (1, 4, 3),
 (1, 4, 5),
 (1, 5, 3),
 (1, 5, 4),
 (2, 3, 4),
 (2, 3, 5),
 (2, 4, 3),
 (2, 4, 5),
 (2, 5, 3),
 (2, 5, 4)]

Edit: added a filter to drop tuples that contain duplicate values.
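
For comparison, the same duplicate check could also be written with a set instead of a Counter. This is only a sketch of an alternative, assuming the values are hashable; the name all_distinct is hypothetical.

```python
from itertools import product


def all_distinct(t):
    """Return True when no value occurs more than once in t."""
    return len(set(t)) == len(t)


d = {'a': [0, 1, 2], 'b': [3, 4, 5]}
kept = [t for t in product(*(d[ch] for ch in 'aab')) if all_distinct(t)]
print(len(kept))   # 18
print(kept[:2])    # [(0, 1, 3), (0, 1, 4)]
```

Unlike a max-based check, this version also accepts an empty tuple without raising an error.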


I like the idea of a callable dict that can be used with map. It could be used here.

class CallDict(dict):
    def __call__(self, key):
        return self[key]    #self.get(key)

e = CallDict([('a',[0,1,2]), ('b',[3,4,5])])

for seq in c:
    data = map(e, seq)
    print(f'sequence: {seq}')
    for thing in filter(f, itertools.product(*data)):
        print(thing)
    print('***************************')
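
If a dict subclass feels like too much machinery, a plain dict's bound __getitem__ method is already a callable that map can use. A minimal sketch of that alternative (the variable names are chosen here for illustration):

```python
from itertools import product

d = {'a': [0, 1, 2], 'b': [3, 4, 5]}

# d.__getitem__ takes a key and returns the stored list, raising KeyError
# for unknown letters, just like self[key] in the callable dict.
data = map(d.__getitem__, 'ab')
pairs = list(product(*data))
print(len(pairs))   # 9
print(pairs[:3])    # [(0, 3), (0, 4), (0, 5)]
```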

I couldn't help myself: here is a generic version of @PM2Ring's answer. Instead of filtering out unwanted items, it never produces them in the first place.

d = {'a':[0,1,2], 'b':[3,4,5]}
c = ['aab', 'abb', 'aaa', 'aba']
def g(d, c):
    for seq in c:
        print(f'sequence: {seq}')
        counts = collections.Counter(seq)
##        data = (itertools.combinations(d[key],r) for key, r in counts.items())
        data = (itertools.permutations(d[key],r) for key, r in counts.items())
        for thing in itertools.product(*data):
            q = {key:iter(other) for key, other in zip(counts, thing)}
            yield [next(q[k]) for k in seq]

for t in g(d, c):
    print(t)



Answer 5:


It looks like you're looking for a way to call itertools.product programmatically:

from itertools import product

d = {'a': [0,1,2],
     'b': [3,4,5]}
c = ['aab', 'abb', 'aaa']

for s in c:
    print(list(product(*[d[x] for x in s])))


Source: https://stackoverflow.com/questions/48121163/all-combinations-of-values-from-two-lists-representing-a-certain-feature
