What is the fastest way to map group names of numpy array to indices?

2020-12-17 20:48

I'm working with a 3D point cloud from Lidar. The points are given as a numpy array that looks like this:

points = np.array([[61651921, 416326074, 39805], [6160525         


        
3 Answers
  • 2020-12-17 20:59

    You could use Cython:

    %%cython -c-O3 -c-march=native -a
    #cython: language_level=3, boundscheck=False, wraparound=False, initializedcheck=False, cdivision=True, infer_types=True
    
    import math
    import cython as cy
    
    cimport numpy as cnp
    
    
    cpdef groupby_index_dict_cy(cnp.int32_t[:, :] arr):
        # Map each unique (x, y, z) row to the list of row indices where it occurs.
        cdef cy.size_t size = len(arr)
        result = {}
        for i in range(size):
            key = arr[i, 0], arr[i, 1], arr[i, 2]
            if key in result:
                result[key].append(i)
            else:
                result[key] = [i]
        return result
    

    but it will not make you faster than what Pandas does, although it is the fastest approach after that one (and perhaps after the numpy_index based solution), and it does not come with Pandas' memory penalty. A collection of what has been proposed so far is here.

    On the OP's machine this should get close to ~12 sec of execution time.
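
    A minimal usage sketch (assuming points is the OP's (N, 3) integer array from the question; the int32 cast is an assumption made to match the memoryview signature, and it requires that all coordinates fit in 32-bit integers):

    import numpy as np

    # Cast/copy to a C-contiguous int32 array so it matches cnp.int32_t[:, :].
    arr = np.ascontiguousarray(points, dtype=np.int32)
    groups = groupby_index_dict_cy(arr)  # {(x, y, z): [row indices, ...]}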

  • 2020-12-17 21:00

    Constant number of indices per group

    Approach #1

    We can perform dimensionality-reduction to reduce cubes to a 1D array. This is based on mapping the given cubes data onto an n-dim grid to compute the linear-index equivalents, discussed in detail here. Then, based on the uniqueness of those linear indices, we can segregate unique groups and their corresponding indices. Hence, following that strategy, we would have one solution, like so -

    N = 4 # number of indices per group
    c1D = np.ravel_multi_index(cubes.T, cubes.max(0)+1)
    sidx = c1D.argsort()
    indices = sidx.reshape(-1,N)
    unq_groups = cubes[indices[:,0]]
    
    # If you need in a zipped dictionary format
    out = dict(zip(map(tuple,unq_groups), indices))
    
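    As a quick illustrative check with a toy array of N = 2 identical rows per group (not the OP's data):

    toy = np.array([[1, 2, 3], [4, 5, 6], [1, 2, 3], [4, 5, 6]])
    # Substituting toy for cubes above with N = 2 gives:
    # {(1, 2, 3): array([0, 2]), (4, 5, 6): array([1, 3])}
    # (the index order within a group may vary, since argsort is not stable by default)
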

    Alternative #1 : If the integer values in cubes are too large, we might want to do the dimensionality-reduction such that the dimensions with shorter extents are chosen as the primary (low-stride) axes; the column with the largest values then only scales the top stride, so the linear indices stay unique while being kept as small as possible. Hence, for those cases, we can modify the reduction step to get c1D, like so -

    s1,s2 = cubes[:,:2].max(0)+1
    s = np.r_[s2,1,s1*s2]
    c1D = cubes.dot(s)
    

    Approach #2

    Next up, we can use a Cython-powered kd-tree for quick nearest-neighbour lookup. Since each group consists of N identical rows, the N nearest neighbours of any point are exactly the members of its own group, and hence we can solve our case like so -

    from scipy.spatial import cKDTree
    
    idx = cKDTree(cubes).query(cubes, k=N)[1] # N = 4 as discussed earlier
    I = idx[:,0].argsort().reshape(-1,N)[:,0]
    unq_groups,indices = cubes[I],idx[I]
    

    Generic case : Variable number of indices per group

    We will extend the argsort based method with some splitting to get our desired output, like so -

    c1D = np.ravel_multi_index(cubes.T, cubes.max(0)+1)
    
    sidx = c1D.argsort()
    c1Ds = c1D[sidx]
    split_idx = np.flatnonzero(np.r_[True,c1Ds[:-1]!=c1Ds[1:],True])
    grps = cubes[sidx[split_idx[:-1]]]
    
    indices = [sidx[i:j] for (i,j) in zip(split_idx[:-1],split_idx[1:])]
    # If needed as dict o/p
    out = dict(zip(map(tuple,grps), indices))
    

    Using 1D versions of groups of cubes as keys

    We will extend the earlier listed method by using the 1D (linearized) versions of the groups of cubes as the dictionary keys, which simplifies dictionary creation and also makes it more efficient, like so -

    def numpy1(cubes):
        c1D = np.ravel_multi_index(cubes.T, cubes.max(0)+1)        
        sidx = c1D.argsort()
        c1Ds = c1D[sidx]
        mask = np.r_[True,c1Ds[:-1]!=c1Ds[1:],True]
        split_idx = np.flatnonzero(mask)
        indices = [sidx[i:j] for (i,j) in zip(split_idx[:-1],split_idx[1:])]
        out = dict(zip(c1Ds[mask[:-1]],indices))
        return out
    

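    Note that the keys in numpy1's output are the linearized c1D values rather than coordinate tuples; if the original (x, y, z) triples are needed, they can be recovered with np.unravel_index against the same grid shape (a minimal sketch):

    dims = cubes.max(0) + 1                # same grid shape used in numpy1
    out = numpy1(cubes)
    key = next(iter(out))                  # one linearized key
    coords = np.unravel_index(key, dims)   # back to the (x, y, z) triple
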
    Next up, we will make use of the numba package to iterate and get to the final hashable dictionary output. There would be two solutions with it - one that gets the keys and values separately using numba, with the main calling function zipping and converting them to a dict, while the other one creates a numba-supported dict type directly, so no extra work is required by the main calling function.

    Thus, we would have the first numba solution:

    from numba import njit
    
    @njit
    def _numba1(sidx, c1D):
        out = []
        n = len(sidx)
        start = 0
        grpID = []
        for i in range(1,n):
            if c1D[sidx[i]]!=c1D[sidx[i-1]]:
                out.append(sidx[start:i])
                grpID.append(c1D[sidx[start]])
                start = i
        out.append(sidx[start:])
        grpID.append(c1D[sidx[start]])
        return grpID,out
    
    def numba1(cubes):
        c1D = np.ravel_multi_index(cubes.T, cubes.max(0)+1)
        sidx = c1D.argsort()
        out = dict(zip(*_numba1(sidx, c1D)))
        return out
    

    And the second numba solution:

    from numba import types
    from numba.typed import Dict
    
    int_array = types.int64[:]
    
    @njit
    def _numba2(sidx, c1D):
        n = len(sidx)
        start = 0
        outt = Dict.empty(
            key_type=types.int64,
            value_type=int_array,
        )
        for i in range(1,n):
            if c1D[sidx[i]]!=c1D[sidx[i-1]]:
                outt[c1D[sidx[start]]] = sidx[start:i]
                start = i
        outt[c1D[sidx[start]]] = sidx[start:]
        return outt
    
    def numba2(cubes):
        c1D = np.ravel_multi_index(cubes.T, cubes.max(0)+1)    
        sidx = c1D.argsort()
        out = _numba2(sidx, c1D)
        return out
    

    Timings with cubes.npz data -

    In [4]: cubes = np.load('cubes.npz')['array']
    
    In [5]: %timeit numpy1(cubes)
       ...: %timeit numba1(cubes)
       ...: %timeit numba2(cubes)
    2.38 s ± 14.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    2.13 s ± 25.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    1.8 s ± 5.95 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    

    Alternative #1 : We can achieve further speedup with numexpr for large arrays to compute c1D, like so -

    import numexpr as ne
    
    s0,s1 = cubes[:,0].max()+1,cubes[:,1].max()+1
    d = {'s0':s0,'s1':s1,'c0':cubes[:,0],'c1':cubes[:,1],'c2':cubes[:,2]}
    c1D = ne.evaluate('c0+c1*s0+c2*s0*s1',d)
    

    This would be applicable at all places that require c1D.
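
    For example, a minimal sketch of dropping the numexpr-computed c1D into numpy1 (the name numpy1_numexpr is just for illustration):

    import numexpr as ne

    def numpy1_numexpr(cubes):
        s0, s1 = cubes[:,0].max()+1, cubes[:,1].max()+1
        d = {'s0':s0, 's1':s1, 'c0':cubes[:,0], 'c1':cubes[:,1], 'c2':cubes[:,2]}
        c1D = ne.evaluate('c0+c1*s0+c2*s0*s1', d)   # replaces ravel_multi_index
        sidx = c1D.argsort()
        c1Ds = c1D[sidx]
        mask = np.r_[True, c1Ds[:-1]!=c1Ds[1:], True]
        split_idx = np.flatnonzero(mask)
        indices = [sidx[i:j] for (i,j) in zip(split_idx[:-1], split_idx[1:])]
        return dict(zip(c1Ds[mask[:-1]], indices))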

  • 2020-12-17 21:09

    You might just iterate and add the index of each element to the corresponding list.

    from collections import defaultdict
    
    res = defaultdict(list)
    
    for idx, elem in enumerate(cubes):
        #res[tuple(elem)].append(idx)
        res[elem.tobytes()].append(idx)
    

    Runtime can be further improved by using tobytes() instead of converting the key to a tuple.
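
    If the original (x, y, z) values are needed back from a bytes key, they can be recovered with np.frombuffer, assuming the key came from a row of cubes so that the dtype matches (a minimal sketch):

    import numpy as np

    key = next(iter(res))
    coords = np.frombuffer(key, dtype=cubes.dtype)  # e.g. array([x, y, z])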
