问题
The following question is on python 3.6. Suppose I have lists of sets, for example
L1 = [{2,7},{2,7,8},{2,3,6,7},{1,2,4,5,7}]      
L2 = [{3,6},{1,3,4,6,7},{2,3,5,6,8}]      
L3 = [{2,5,7,8},{1,2,3,5,7,8}, {2,4,5,6,7,8}] 
I need to find all the intersection sets between each element of L1, L2, and L3. E.g.:
    {2,7}.intersection({3,6}).intersection({2,5,7,8})= empty  
    {2,7}.intersection({3,6}).intersection({1,2,3,5,7,8})= empty  
    {2,7}.intersection({3,6}).intersection({2,4,5,6,7,8})= empty  
    {2,7}.intersection({1,3,4,6,7}).intersection({2,5,7,8})= {7}  
    {2,7}.intersection({1,3,4,6,7}).intersection({1,2,3,5,7,8})= {7}  
    {2,7}.intersection({1,3,4,6,7}).intersection({2,4,5,6,7,8})= {7}
...............................
If we keep doing like this, we end up with the following set:
{{empty},{2},{3},{6},{7},{2,3},{2,5},{2,6},{2,8},{3,7},{4,7},{6,7}}
Suppose:
        - I have many lists L1, L2, L3,...Ln. And I do not know how many lists I have.
        - Each list L1, L2, L3..Ln are big, so I can not load all of them into the memory.
My question is: Is there any way to calculate that set sequentially, e.g., calculate between L1 and L2, then using result to calculate with L3, and so on...
回答1:
You can first calculate all possible intersections between L1 and L2, then calculate the intersections between that set and L3 and so on.
list_generator = iter([  # some generator that produces your lists 
    [{2,7}, {2,7,8}, {2,3,6,7}, {1,2,4,5,7}],      
    [{3,6}, {1,3,4,6,7}, {2,3,5,6,8}],      
    [{2,5,7,8}, {1,2,3,5,7,8}, {2,4,5,6,7,8}], 
])
# for example, you can read from a file:
# (adapt the format to your needs)
def list_generator_from_file(filename):
    with open(filename) as f:
        for line in f:
            yield list(map(lambda x: set(x.split(',')), line.strip().split('|')))
# list_generator would be then list_generator_from_file('myfile.dat')
intersections = next(list_generator)  # get first list
new_intersections = set()
for list_ in list_generator:
    for old in intersections:
        for new in list_:
            new_intersections.add(frozenset(old.intersection(new)))
    # at this point we don't need the current list any more
    intersections, new_intersections = new_intersections, set()
print(intersections)
Output looks like {frozenset({7}), frozenset({3, 7}), frozenset({3}), frozenset({6}), frozenset({2, 6}), frozenset({6, 7}), frozenset(), frozenset({8, 2}), frozenset({2, 3}), frozenset({1, 7}), frozenset({4, 7}), frozenset({2, 5}), frozenset({2})}, which matches what you have except for the {1,7} set you missed.
回答2:
You can use functools.reduce(set.intersection, sets) to handle variable inputs. And itertools.product(nested_list_of_sets) to generate combinations with one element from each of several sequences.
By using generator functions (yield) and lazy iterators such as itertools.product, you can reduce memory usage by orders of magnitude.
import itertools
import functools
nested_list_of_sets = [
    [{2,7}, {2,7,8}, {2,3,6,7}, {1,2,4,5,7}], 
    [{3,6}, {1,3,4,6,7}, {2,3,5,6,8}],
    [{2,5,7,8}, {1,2,3,5,7,8}, {2,4,5,6,7,8}],
]
def find_intersections(sets):
    """Take a nested sequence of sets and generate intersections"""
    for combo in itertools.product(*sets):
        yield (combo, functools.reduce(set.intersection, combo))
for input_sets, output_set in find_intersections(nested_list_of_sets):
    print('{:<55}  ->   {}'.format(repr(input_sets), output_set))
Output is
({2, 7}, {3, 6}, {8, 2, 5, 7})                           ->   set()
({2, 7}, {3, 6}, {1, 2, 3, 5, 7, 8})                     ->   set()
({2, 7}, {3, 6}, {2, 4, 5, 6, 7, 8})                     ->   set()
({2, 7}, {1, 3, 4, 6, 7}, {8, 2, 5, 7})                  ->   {7}
({2, 7}, {1, 3, 4, 6, 7}, {1, 2, 3, 5, 7, 8})            ->   {7}
({2, 7}, {1, 3, 4, 6, 7}, {2, 4, 5, 6, 7, 8})            ->   {7}
({2, 7}, {2, 3, 5, 6, 8}, {8, 2, 5, 7})                  ->   {2}
({2, 7}, {2, 3, 5, 6, 8}, {1, 2, 3, 5, 7, 8})            ->   {2}
# ... etc
Online demo on repl.it
回答3:
This may be what you are looking for:
res = {frozenset(frozenset(x) for x in (i, j, k)): i & j & k \
       for i in L1 for j in L2 for k in L3}
Explanation
- frozensetis required because- setis not hashable. Dictionary keys must be hashable.
- Cycle through every length-3 combination of items in L1, L2, L3.
- Calculate intersection via &operation, equivalent toset.intersection.
来源:https://stackoverflow.com/questions/49206395/find-intersection-sets-between-list-of-sets