Python list intersection efficiency: generator or filter()?

Submitted by 独自空忆成欢 on 2019-12-01 05:18:18

Neither of these. The best way is to use sets.

list1 = [1, 2, 3, 4]
list2 = [3, 4, 5, 6]
result = set(list1).intersection(list2)

Sets are iterable, so no need to convert the result into anything.
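A minimal sketch of what the set-based version returns, assuming the inputs may contain duplicates. The result is a set, so duplicates collapse and the iteration order is not guaranteed; the sample lists and the names `common` and `ordered` are illustrative only.

```python
# Illustrative inputs with duplicates (not from the original question).
list1 = [1, 2, 3, 4, 4]
list2 = [4, 3, 3, 5, 6]

# intersection() accepts any iterable, so only one list is converted explicitly.
common = set(list1).intersection(list2)

# The set can be iterated directly; sort it only when a stable order is needed.
ordered = sorted(common)
for value in ordered:
    print(value)
```

If the order or the duplicates of the original lists matter, a set result is not enough and a list-based approach is the better fit.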

Your solution has a complexity of O(m*n), where m and n are the respective lengths of the two lists. You can improve the complexity to O(m+n) using a set for one of the lists:

s = set(list1)
result = [x for x in list2 if x in s]
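The same idea can be wrapped in a small helper. This is only a sketch: the function name `intersect_preserving_order` is hypothetical, and it assumes every element is hashable. Unlike a pure set intersection, it keeps the order and the duplicates of the second list.

```python
def intersect_preserving_order(first, second):
    """Return the items of `second` that also occur in `first`, in `second`'s order."""
    lookup = set(first)  # built once in O(m); elements must be hashable
    # Each membership test is O(1) on average, so the scan is O(n).
    return [x for x in second if x in lookup]


list1 = [1, 2, 3, 4]
list2 = [6, 4, 3, 3, 5]
result = intersect_preserving_order(list1, list2)
print(result)  # [4, 3, 3]
```

Building the set from the list that is only used for lookups keeps the output tied to the list whose order you care about.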

In cases where speed matters more than readability (that is, almost never), you can also use

result = filter(set(a).__contains__, b)  # on Python 3 this is a lazy iterator; wrap it in list() to get a list

which is about 20 percent faster than the other solutions on my machine.
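The speed difference depends on the machine, the Python version and the data, so it is worth measuring rather than assuming. The following is a rough benchmark sketch using `timeit`; the input sizes, the seed and the function names are arbitrary choices for illustration, not taken from the original answers.

```python
import random
import timeit

random.seed(0)  # fixed seed so repeated runs use the same data
a = random.sample(range(100_000), 10_000)
b = random.sample(range(100_000), 10_000)


def with_comprehension():
    s = set(a)
    return [x for x in b if x in s]


def with_filter():
    # list() forces evaluation; on Python 3 filter() alone does no work yet.
    return list(filter(set(a).__contains__, b))


def with_set_intersection():
    return set(a).intersection(b)


# Sanity check: all three approaches agree on the common elements.
assert with_comprehension() == with_filter()
assert set(with_filter()) == with_set_intersection()

for fn in (with_comprehension, with_filter, with_set_intersection):
    seconds = timeit.timeit(fn, number=50)
    print(f"{fn.__name__}: {seconds:.4f} s for 50 runs")
```

No expected timings are shown here because they vary between environments.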

For plain lists, the most efficient way is to use:

result = set(list1).intersection(list2)

as mentioned above. For NumPy arrays, however, the intersect1d function is more efficient:

import numpy as np
result = np.intersect1d(list1, list2)

In particular, when you know that neither input contains duplicate values, you can call it as:

result = np.intersect1d(list1, list2, assume_unique=True)