Quick method to enumerate two big arrays?

死守一世寂寞 2021-01-16 07:56

I have two big arrays to work on. Let's take a look at the following simplified example to get the idea:

I would like to find if an element in data1

3 answers
  •  没有蜡笔的小新
    2021-01-16 08:39

    Because your data consists entirely of integers, you can use a dictionary (hash table). On the same data as in Paul's answer, this takes 0.55 seconds. It won't necessarily find every pairing between a and b if a and b themselves contain duplicates, but it is easy enough to modify it to do that, or to make a second pass afterward (over just the matched items) to check for other occurrences of those vectors in the data.

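    from time import perf_counter

    import numpy as np

    def intersect1(a, b):
        # Map each element of a to its index; a duplicate in a keeps only its last index.
        a_d = {}
        for i, x in enumerate(a):
            a_d[x] = i
        # Yield (index in a, index in b) for every element of b that also occurs in a.
        for i, y in enumerate(b):
            if y in a_d:
                yield a_d[y], i

    # Test data: one million random integer pairs per array, converted to hashable tuples.
    a = [tuple(x) for x in np.random.randint(0, 100000, (1000000, 2))]
    b = [tuple(x) for x in np.random.randint(0, 100000, (1000000, 2))]

    # Time the full enumeration, including printing the matches.
    t = perf_counter()
    print(list(intersect1(a, b)))
    s = perf_counter()
    print(s - t)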
    

    For comparison, Paul's answer takes 2.46 s on my machine.
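
The answer above notes that the dictionary approach can be modified to report every pairing when a and b contain duplicates. The following is a minimal sketch of one way that might be done, not the answerer's own code: the name `intersect_all` and the small sample lists are hypothetical, and it assumes the elements are hashable (for example, tuples of integers). Instead of storing a single index per element of a, it stores a list of all indices.

```python
from collections import defaultdict

def intersect_all(a, b):
    """Yield every (index_in_a, index_in_b) pair whose elements are equal."""
    # Map each distinct element of a to the list of all positions where it occurs.
    positions = defaultdict(list)
    for i, x in enumerate(a):
        positions[x].append(i)
    for j, y in enumerate(b):
        # .get() avoids inserting an empty list into the defaultdict on a miss.
        for i in positions.get(y, ()):
            yield i, j

# Hypothetical sample data with duplicates in both lists.
a = [(1, 2), (3, 4), (1, 2)]
b = [(1, 2), (5, 6), (3, 4), (1, 2)]
pairs = list(intersect_all(a, b))
print(pairs)
```

Note that the number of reported pairs grows with the product of the duplicate counts, so heavily duplicated data can make the output itself the dominant cost.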
