I would like to intersect two lists in Python (2.7). I need the result to be iterable:
list1 = [1,2,3,4]
list2 = [3,4,5,6]
result = (3,4) # any kind of iterable
Provided that a full iteration will be performed first thing after the intersection, which of the following is more efficient?
Using a generator:
result = (x for x in list1 if x in list2)
Using filter():
result = filter(lambda x: x in list2, list1)
Other suggestions?
Thanks in advance,
Amnon
Neither of these. The best way is to use sets.
list1 = [1,2,3,4]
list2 = [3,4,5,6]
result = set(list1).intersection(list2)
Sets are iterable, so no need to convert the result into anything.
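One thing to keep in mind with the set approach: the result contains each common value only once, and its iteration order is not guaranteed. A small sketch of the difference, using made-up input lists with duplicates (the names as_set, lookup and as_list are just for illustration):

```python
list1 = [3, 1, 3, 4, 2]
list2 = [4, 3, 3, 6]

# Set intersection: each common value appears once, in no guaranteed order.
as_set = set(list1).intersection(list2)

# Filtering list1 against a set built from list2 keeps list1's order
# and its duplicates.
lookup = set(list2)
as_list = [x for x in list1 if x in lookup]

print(sorted(as_set))  # [3, 4]
print(as_list)         # [3, 3, 4]
```

If order or duplicates matter to you, the filtering form is probably the one you want; otherwise the plain set intersection is fine.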
Your solution has a complexity of O(m*n), where m and n are the respective lengths of the two lists. You can improve the complexity to O(m+n) using a set for one of the lists:
s = set(list1)
result = [x for x in list2 if x in s]
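To see the difference on your own machine, here is a rough timing sketch using timeit. The list sizes and the function names quadratic and linear are arbitrary choices for illustration, and the actual numbers will vary from machine to machine:

```python
import timeit

list1 = list(range(0, 2000))
list2 = list(range(1000, 3000))

def quadratic():
    # "x in list2" scans the list for every element of list1: O(m*n).
    return [x for x in list1 if x in list2]

def linear():
    # One pass to build the set, then average O(1) lookups: O(m+n).
    s = set(list2)
    return [x for x in list1 if x in s]

# Both approaches must produce the same result.
assert quadratic() == linear()

print("list lookup: %.4f s" % timeit.timeit(quadratic, number=5))
print("set lookup:  %.4f s" % timeit.timeit(linear, number=5))
```

The gap should widen as the lists grow, since the list-lookup version does work proportional to the product of the two lengths.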
In cases where speed matters more than readability (that is, almost never), you can also use
result = filter(set(list1).__contains__, list2)
which is about 20 percent faster than the other solutions on my machine.
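One caveat if this is ever run on Python 3: filter() returns a lazy iterator there rather than a list. A sketch that behaves the same on both versions, assuming you actually need a list (the name is_common is just for illustration):

```python
list1 = [1, 2, 3, 4]
list2 = [3, 4, 5, 6]

# Bound method of a temporary set; each lookup is O(1) on average.
is_common = set(list1).__contains__

# list() makes the result a list on Python 3 as well;
# on 2.7 filter() already returns a list.
result = list(filter(is_common, list2))
print(result)  # [3, 4]
```

If you only iterate over the result once, you can drop the list() call and loop over the filter() result directly.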
For the case of lists, the most efficient way is to use:
result = set(list1).intersection(list2)
as mentioned above, but for NumPy arrays the intersect1d
function is more efficient:
import numpy as np
result = np.intersect1d(list1, list2)
In particular, when you know that the lists don't have duplicate values, you can use it as:
result = np.intersect1d(list1, list2, assume_unique=True)
Source: https://stackoverflow.com/questions/6369527/python-list-intersection-efficiency-generator-or-filter