Quick method to enumerate two big arrays?

死守一世寂寞 2021-01-16 07:56

I have two big arrays to work on. But let's take a look at the following simplified example to get the idea:

I would like to find if an element in data1

3 answers

没有蜡笔的小新 (original poster)

2021-01-16 08:39
Because your data is all integers, you can use a dictionary (hash table). It takes 0.55 seconds on the same data as in Paul's answer. This won't necessarily find every pairing between a and b if a and b themselves contain duplicates: each match in b is paired only with the last index at which that value occurs in a. It's easy enough to modify this to find them all, or to make a second pass afterward (over just the matched items) to check for other occurrences of those vectors in the data.
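```
import numpy as np
from time import perf_counter

def intersect1(a, b):
    # Map each element of a to its index; for duplicates the last index wins.
    a_d = {}
    for i, x in enumerate(a):
        a_d[x] = i
    # Yield (index in a, index in b) for every element of b that also occurs in a.
    for i, y in enumerate(b):
        if y in a_d:
            yield a_d[y], i

# Test data: one million random integer pairs per array, as hashable tuples.
a = list(tuple(x) for x in list(np.random.randint(0, 100000, (1000000, 2))))
b = list(tuple(x) for x in list(np.random.randint(0, 100000, (1000000, 2))))

# Time the lookup; printing the result is included in the measurement.
t = perf_counter()
print(list(intersect1(a, b)))
s = perf_counter()
print(s - t)
```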
For comparison, Paul's takes 2.46s on my machine.
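If you do need every pairing when a or b contain duplicates, one possible modification is to store a list of indices per value instead of a single index. This is only a sketch: the name `intersect_all` and the small sample data are made up for illustration, and I have not timed it against the versions above.

```python
from collections import defaultdict

def intersect_all(a, b):
    """Yield every (i, j) such that a[i] == b[j], duplicates included."""
    # Hypothetical variant of intersect1: keep all indices of each value in a.
    positions = defaultdict(list)
    for i, x in enumerate(a):
        positions[x].append(i)
    for j, y in enumerate(b):
        # .get avoids creating empty entries for values that are missing from a.
        for i in positions.get(y, ()):
            yield i, j

# Small illustrative input; (1, 2) occurs twice in both lists.
a = [(1, 2), (3, 4), (1, 2)]
b = [(1, 2), (5, 6), (3, 4), (1, 2)]
pairs = list(intersect_all(a, b))
print(pairs)
```

Keep in mind that with many repeated values the number of pairs can grow toward len(a) * len(b), so this can be much slower than the single-index version on heavily duplicated data.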