问题
Suppose I have two lists, L and M. Now I want to know if they share an element. Which would be the fastest way of asking (in python) if they share an element? I don't care which elements they share, or how many, just if they share or not.
For example, in this case
L = [1,2,3,4,5,6]
M = [8,9,10]
I should get False, and here:
L = [1,2,3,4,5,6]
M = [5,6,7]
I should get True.
I hope the question's clear. Thanks!
Manuel
回答1:
Or more concisely
if set(L) & set(M):
# there is an intersection
else:
# no intersection
If you really need True
or False
bool(set(L) & set(M))
After running some timings, this seems to be a good option to try too
m_set=set(M)
any(x in m_set for x in L)
If the items in M or L are not hashable you have to use a less efficient approach like this
any(x in M for x in L)
Here are some timings for 100 item lists. Using sets is considerably faster when there is no intersection, and a bit slower when there is a considerable intersection.
M=range(100)
L=range(100,200)
timeit set(L) & set(M)
10000 loops, best of 3: 32.3 µs per loop
timeit any(x in M for x in L)
1000 loops, best of 3: 374 µs per loop
timeit m_set=frozenset(M);any(x in m_set for x in L)
10000 loops, best of 3: 31 µs per loop
L=range(50,150)
timeit set(L) & set(M)
10000 loops, best of 3: 18 µs per loop
timeit any(x in M for x in L)
100000 loops, best of 3: 4.88 µs per loop
timeit m_set=frozenset(M);any(x in m_set for x in L)
100000 loops, best of 3: 9.39 µs per loop
# Now for some random lists
import random
L=[random.randrange(200000) for x in xrange(1000)]
M=[random.randrange(200000) for x in xrange(1000)]
timeit set(L) & set(M)
1000 loops, best of 3: 420 µs per loop
timeit any(x in M for x in L)
10 loops, best of 3: 21.2 ms per loop
timeit m_set=set(M);any(x in m_set for x in L)
1000 loops, best of 3: 168 µs per loop
timeit m_set=frozenset(M);any(x in m_set for x in L)
1000 loops, best of 3: 371 µs per loop
回答2:
To avoid the work of constructing the intersection, and produce an answer as soon as we know that they intersect:
m_set = frozenset(M)
return any(x in m_set for x in L)
Update: gnibbler tried this out and found it to run faster with set() in place of frozenset(). Whaddayaknow.
回答3:
First of all, if you do not need them ordered, then switch to the set
type.
If you still need the list type, then do it this way: 0 == False
len(set.intersection(set(L), set(M)))
回答4:
That's the most generic and efficient in a balanced way I could come up with (comments should make the code easy to understand):
import itertools, operator
def _compare_product(list1, list2):
"Return if any item in list1 equals any item in list2 exhaustively"
return any(
itertools.starmap(
operator.eq,
itertools.product(list1, list2)))
def do_they_intersect(list1, list2):
"Return if any item is common between list1 and list2"
# do not try to optimize for small list sizes
if len(list1) * len(list2) <= 100: # pick a small number
return _compare_product(list1, list2)
# first try to make a set from one of the lists
try: a_set= set(list1)
except TypeError:
try: a_set= set(list2)
except TypeError:
a_set= None
else:
a_list= list1
else:
a_list= list2
# here either a_set is None, or we have a_set and a_list
if a_set:
return any(itertools.imap(a_set.__contains__, a_list))
# try to sort the lists
try:
a_list1= sorted(list1)
a_list2= sorted(list2)
except TypeError: # sorry, not sortable
return _compare_product(list1, list2)
# they could be sorted, so let's take the N+M road,
# not the N*M
iter1= iter(a_list1)
iter2= iter(a_list2)
try:
item1= next(iter1)
item2= next(iter2)
except StopIteration: # one of the lists is empty
return False # ie no common items
while 1:
if item1 == item2:
return True
while item1 < item2:
try: item1= next(iter1)
except StopIteration: return False
while item2 < item1:
try: item2= next(iter2)
except StopIteration: return False
HTH.
来源:https://stackoverflow.com/questions/2197482/efficiently-knowing-if-intersection-of-two-list-is-empty-or-not-in-python