How to find elements that are common to all lists in a nested list?

孤城傲影 2020-12-21 12:37

I have a large nested list, and each list within it contains numbers formatted as floats. However every individual list in the nested list is

4 answers
  •  抹茶落季
    2020-12-21 12:52

    You can use reduce and set.intersection (in Python 3, reduce has to be imported from functools):

    >>> reduce(set.intersection, map(set, nested_list))
    set([2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0])
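The session above is Python 2. A minimal Python 3 sketch of the same idea follows; the sample data is hypothetical and only stands in for the question's nested list.

```python
from functools import reduce  # reduce is not a builtin in Python 3

# Hypothetical sample data, not the list from the question.
nested_list = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [0.0, 2.0, 3.0, 9.0],
]

# Turn each inner list into a set, then intersect the sets pairwise,
# left to right.
common = reduce(set.intersection, map(set, nested_list))
print(sorted(common))  # [2.0, 3.0]
```

Note that reduce raises a TypeError when nested_list is empty, because no initial value is given.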
    

    Use itertools.imap for a memory-efficient solution (Python 2 only; in Python 3 the built-in map is already lazy).
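The memory-saving idea can also be written as an explicit loop that never holds more than one converted list at a time. This is a sketch for Python 3, not the code from the answer; the helper name common_elements and the choice to return an empty set for empty input are assumptions.

```python
def common_elements(lists):
    """Intersect an iterable of lists lazily, one inner list at a time."""
    iterator = iter(lists)
    try:
        result = set(next(iterator))
    except StopIteration:
        return set()  # assumption: no lists means no common elements
    for inner in iterator:
        # intersection_update accepts any iterable, so the inner list
        # does not need to be converted to a set first.
        result.intersection_update(inner)
        if not result:
            break  # nothing left in common, so stop early
    return result


print(sorted(common_elements([[1.0, 2.0, 3.0], [2.0, 3.0], [3.0, 2.0, 7.0]])))
```

Because the function only iterates its argument once, it also works with a generator that produces the inner lists on demand.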

    Timing Comparisons:

    >>> lis = [[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0],
                  [2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0],
                  [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0],
                  [2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0]]
    >>> %timeit set.intersection(*map(set, lis))
    100000 loops, best of 3: 12.5 us per loop
    >>> %timeit set.intersection(*(set(e) for e in lis))
    10000 loops, best of 3: 14.4 us per loop
    >>> %timeit reduce(set.intersection, map(set, lis))
    10000 loops, best of 3: 12.8 us per loop
    >>> %timeit reduce(set.intersection, imap(set, lis))
    100000 loops, best of 3: 13.1 us per loop
    >>> %timeit set.intersection(set(lis[0]), *islice(lis, 1, None))
    100000 loops, best of 3: 10.6 us per loop
    
    
    >>> lis = [[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0],
                  [2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0],
                  [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0],
                  [2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0]]*1000
    >>> %timeit set.intersection(*map(set, lis))
    10 loops, best of 3: 16.4 ms per loop
    >>> %timeit set.intersection(*(set(e) for e in lis))
    10 loops, best of 3: 15.8 ms per loop
    >>> %timeit reduce(set.intersection, map(set, lis))
    100 loops, best of 3: 16.3 ms per loop
    >>> %timeit reduce(set.intersection, imap(set, lis))
    10 loops, best of 3: 13.8 ms per loop
    >>> %timeit set.intersection(set(lis[0]), *islice(lis, 1, None))
    100 loops, best of 3: 8.4 ms per loop
    
    
    >>> lis = [[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0],
                  [2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0],
                  [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0],
                  [2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0]]*10**5
    >>> %timeit set.intersection(*map(set, lis))  
    1 loops, best of 3: 1.92 s per loop
    >>> %timeit set.intersection(*(set(e) for e in lis))
    1 loops, best of 3: 2.17 s per loop
    >>> %timeit reduce(set.intersection, map(set, lis))
    1 loops, best of 3: 2.14 s per loop
    >>> %timeit reduce(set.intersection, imap(set, lis))
    1 loops, best of 3: 1.52 s per loop
    >>> %timeit set.intersection(set(lis[0]), *islice(lis, 1, None))
    1 loops, best of 3: 913 ms per loop
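The %timeit runs above need IPython and Python 2. To repeat a comparison like this in plain Python 3, something along these lines with the standard timeit module should work. It is only a sketch: the labels, the repetition count, and the subset of variants are choices made here, and the numbers it prints will differ from the ones quoted above.

```python
import timeit
from functools import reduce
from itertools import islice

# Same shape as the test data above: four float lists repeated 1000 times.
base = [
    [float(i) for i in range(1, 16)],
    [float(i) for i in range(2, 15)],
    [float(i) for i in range(1, 15)],
    [float(i) for i in range(2, 16)],
]
lis = base * 1000

# Labels are made up for this sketch; the expressions come from the answer.
candidates = {
    "star-args map": lambda: set.intersection(*map(set, lis)),
    "reduce + map": lambda: reduce(set.intersection, map(set, lis)),
    "first set + islice": lambda: set.intersection(
        set(lis[0]), *islice(lis, 1, None)
    ),
}

runs = 20  # arbitrary repetition count
results = {}
for name, func in candidates.items():
    results[name] = func()  # keep the result to check all variants agree
    seconds = timeit.timeit(func, number=runs)
    print(f"{name}: {seconds / runs * 1e3:.3f} ms per call")
```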
    

    Conclusion:

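    Steven Rumbalski's solution is clearly the best one in terms of efficiency: the set.intersection(set(lis[0]), *islice(lis, 1, None)) call is the fastest at every input size above.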
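The fastest call in the timings builds a set from the first list only and hands the remaining lists to set.intersection unconverted, which is allowed because the method accepts arbitrary iterables. A small Python 3 sketch on hypothetical sample data:

```python
from itertools import islice

# Hypothetical sample data, not the list from the question.
nested_list = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [0.0, 2.0, 3.0, 9.0],
]

# Only the first list is converted to a set; islice skips it and yields
# the remaining lists, which are passed as plain iterables.
common = set.intersection(set(nested_list[0]), *islice(nested_list, 1, None))
print(sorted(common))  # [2.0, 3.0]
```

This form raises an IndexError for an empty nested_list, so that case would need a separate check.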
