Intersection of subelements of multiple list of lists in dictionary

Question

I have a number of lists of lists stored in a dictionary. I want to find the intersection of the sub-lists (i.e. the intersection of dict[i][j]) across all keys of the dictionary.

For example, if the dictionary stored sets of tuples instead, I could use the code:

set.intersection(*[index[key] for key in all_keys])

What is an efficient way to do this? One way I tried was to first convert each list of lists into a set of tuples and then take the intersection of those, but this is rather clunky.

Example:

Suppose the dictionary of lists of lists is

dict = {} 
dict['A'] = [[1, 'charlie'], [2, 'frankie']] 
dict['B'] = [[1, 'charlie'], [2, 'chuck']]
dict['C'] = [[2, 'chuck'], [1, 'charlie']]

then I want to return

[1, 'charlie']

(maybe as a tuple, doesn't have to be list)

EDIT: I just found an okay way to do it, but it's not very 'pythonic':

def search(index, search_words): 
    rv = {tuple(t) for t in index[search_words[0]]}
    for word in search_words: 
        rv = rv.intersection({tuple(t) for t in index[word]})
    return rv

Answer 1:

Let's call your dictionary of lists of lists d:

>>> d = {'A': [[1, 'charlie'], [2, 'frankie']], 'B': [[1, 'charlie'], [2, 'chuck']], 'C': [[2, 'chuck'], [1, 'charlie']]}

I am calling it d because dict is a builtin and we would prefer not to overwrite it.

Now, find the intersection:

>>> set.intersection( *[ set(tuple(x) for x in d[k]) for k in d ] )
set([(1, 'charlie')])

How it works

set(tuple(x) for x in d[k])

For key k, this forms a set out of tuples of the elements in d[k]. Taking k='A' for example:
```
>>> k='A'; set(tuple(x) for x in d[k])
set([(2, 'frankie'), (1, 'charlie')])
```

[ set(tuple(x) for x in d[k]) for k in d ]

This makes a list of the sets from the step above. Thus:

>>> [ set(tuple(x) for x in d[k]) for k in d ]
[set([(2, 'frankie'), (1, 'charlie')]),
 set([(2, 'chuck'), (1, 'charlie')]),
 set([(2, 'chuck'), (1, 'charlie')])]

set.intersection( *[ set(tuple(x) for x in d[k]) for k in d ] )

This computes the intersection of the three sets above:
```
>>> set.intersection( *[ set(tuple(x) for x in d[k]) for k in d ] )
set([(1, 'charlie')])
```
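
For reuse, the one-liner above could be wrapped in a small function. This is only a sketch: the name `common_sublists` is made up for illustration, it is written for Python 3 (where the result prints as `{(1, 'charlie')}` rather than `set([...])`), and it assumes every sub-list contains only hashable items. It also guards the empty-dictionary case, since calling `set.intersection()` with no arguments raises a `TypeError`.

```python
def common_sublists(d):
    """Return the set of sub-lists (as tuples) present under every key of d."""
    # Convert each list of lists into a set of tuples so it can be intersected.
    sets = [set(map(tuple, sublists)) for sublists in d.values()]
    if not sets:
        # set.intersection() with no arguments raises TypeError, so bail out early.
        return set()
    return set.intersection(*sets)


d = {
    'A': [[1, 'charlie'], [2, 'frankie']],
    'B': [[1, 'charlie'], [2, 'chuck']],
    'C': [[2, 'chuck'], [1, 'charlie']],
}
print(common_sublists(d))  # {(1, 'charlie')}
```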

Answer 2:

[item for item in d['A'] if item in d['B'] and item in d['C']]
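
A list-only variant of this idea, generalized to any number of keys, might look like the sketch below. It assumes Python 3, and the name `common_items` is hypothetical. The membership test uses `all`, so an item must appear in every other list, not merely in one of them. Each `in` test is a linear scan, so this will be slower than the set-based approaches on large inputs.

```python
def common_items(d):
    """Return the sub-lists of the first value that occur in every other value."""
    lists = list(d.values())
    if not lists:
        return []
    first, rest = lists[0], lists[1:]
    # Keep an item only if every remaining list contains it.
    return [item for item in first if all(item in other for other in rest)]


d = {
    'A': [[1, 'charlie'], [2, 'frankie']],
    'B': [[1, 'charlie'], [2, 'chuck']],
    'C': [[2, 'chuck'], [1, 'charlie']],
}
print(common_items(d))  # [[1, 'charlie']]
```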

Answer 3:

Your use case calls for the reduce function. But, quoting the BDFL's take on reduce:

So now reduce(). This is actually the one I've always hated most, because, apart from a few examples involving + or *, almost every time I see a reduce() call with a non-trivial function argument, I need to grab pen and paper to diagram what's actually being fed into that function before I understand what the reduce() is supposed to do. So in my mind, the applicability of reduce() is pretty much limited to associative operators, and in all other cases it's better to write out the accumulation loop explicitly.

So, what you have got is already 'pythonic'.

I would write the program like this:

>>> d = {'A': [[1, 'charlie'], [2, 'frankie']],
... 'B': [[1, 'charlie'], [2, 'chuck']],
... 'C': [[2, 'chuck'], [1, 'charlie']]}
>>> values = iter(d.values())
>>> result = {tuple(item) for item in next(values)}
>>> for sublists in values:
...     result &= {tuple(item) for item in sublists}
...
>>> result
set([(1, 'charlie')])
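
For comparison, the `reduce` version alluded to above might look like the following. This is a sketch for Python 3, where `reduce` lives in `functools`; `operator.and_` is the function form of `&`. It assumes the dictionary is non-empty, since `reduce` with no initial value raises a `TypeError` on an empty iterable.

```python
from functools import reduce
from operator import and_

d = {
    'A': [[1, 'charlie'], [2, 'frankie']],
    'B': [[1, 'charlie'], [2, 'chuck']],
    'C': [[2, 'chuck'], [1, 'charlie']],
}

# Build one set of tuples per key, then fold them together with &.
sets = ({tuple(item) for item in sublists} for sublists in d.values())
result = reduce(and_, sets)
print(result)  # {(1, 'charlie')}
```

Set intersection is associative, so this falls within the cases the quote considers a reasonable fit for `reduce`.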

Answer 4:

from collections import defaultdict
d = {'A': [[1, 'charlie'], [2, 'frankie']], 'B': [[1, 'charlie'], [2, 'chuck']], 'C': [[2, 'chuck'], [1, 'charlie']]}

# Map each pair (as a tuple) to the set of keys under which it appears.
frequency = defaultdict(set)
for key, items in d.items():
    for pair in items:
        frequency[tuple(pair)].add(key)
# Keep only the pairs that appear under every key.
output = [
    pair for pair, occurrences in frequency.items() if len(d) == len(occurrences)
]
print(output)
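
The same counting idea can be sketched with `collections.Counter`. This is a Python 3 variant, not the answer's original code, and the function name `pairs_in_every_key` is made up for illustration. Each list is de-duplicated first so that a pair repeated under one key is counted only once.

```python
from collections import Counter


def pairs_in_every_key(d):
    """Return the pairs (as tuples) that occur under every key of d."""
    counts = Counter()
    for items in d.values():
        # De-duplicate within one key so repeats do not inflate the count.
        counts.update({tuple(pair) for pair in items})
    # A pair is common to all keys when its count equals the number of keys.
    return [pair for pair, n in counts.items() if n == len(d)]


d = {
    'A': [[1, 'charlie'], [2, 'frankie']],
    'B': [[1, 'charlie'], [2, 'chuck']],
    'C': [[2, 'chuck'], [1, 'charlie']],
}
print(pairs_in_every_key(d))  # [(1, 'charlie')]
```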

Source: https://stackoverflow.com/questions/28145319/intersection-of-subelements-of-multiple-list-of-lists-in-dictionary

Tags

python

list

set