Fastest way to sum over rows of sparse matrix

做~自己de王妃 提交于 2019-12-05 16:15:50

Starting approach

Adapting sparse solution from this post -

def sparse_matrix_mult_sparseX_mod1(X, rows):   
    nrows = rows.max()+1
    ncols = X.shape[1]
    nelem = nrows * ncols

    a,b = X.nonzero()
    ids = rows[a] + b*nrows
    sums = np.bincount(ids, X[a,b].A1, minlength=nelem)
    out = sums.reshape(ncols,-1).T
    return out

Benchmarking

Original approach #1 -

def app1(X, map_fn):
    col = scipy.arange(n)
    val = np.ones(n)
    S = scipy.sparse.csr_matrix( (val, (map_fn, col)), shape = (s,n))
    SX = S.dot(X)
    return SX

Timings and verification -

In [209]: # Inputs setup
     ...: s=10000
     ...: n=1000000
     ...: d=1000
     ...: density=1.0/500
     ...: 
     ...: X=scipy.sparse.rand(n,d,density=density,format="csr")
     ...: map_fn=np.random.randint(0, s, n)
     ...: 

In [210]: out1 = app1(X, map_fn)
     ...: out2 = sparse_matrix_mult_sparseX_mod1(X, map_fn)
     ...: print np.allclose(out1.toarray(), out2)
     ...: 
True

In [211]: %timeit app1(X, map_fn)
1 loop, best of 3: 517 ms per loop

In [212]: %timeit sparse_matrix_mult_sparseX_mod1(X, map_fn)
10 loops, best of 3: 147 ms per loop

To be fair, we should time the final dense array version from app1 -

In [214]: %timeit app1(X, map_fn).toarray()
1 loop, best of 3: 584 ms per loop

Porting to Numba

We could translate the binned counting step to numba, which might be beneficial for denser input matrices. One of the ways to do so would be -

from numba import njit

@njit
def bincount_mod2(out, rows, r, C, V):
    N = len(V)
    for i in range(N):
        out[rows[r[i]], C[i]] += V[i]
    return out

def sparse_matrix_mult_sparseX_mod2(X, rows):
    nrows = rows.max()+1
    ncols = X.shape[1]
    r,C = X.nonzero()

    V = X[r,C].A1
    out = np.zeros((nrows, ncols))
    return bincount_mod2(out, rows, r, C, V)

Timings -

In [373]: # Inputs setup
     ...: s=10000
     ...: n=1000000
     ...: d=1000
     ...: density=1.0/100 # Denser now!
     ...: 
     ...: X=scipy.sparse.rand(n,d,density=density,format="csr")
     ...: map_fn=np.random.randint(0, s, n)
     ...: 

In [374]: %timeit app1(X, map_fn)
1 loop, best of 3: 787 ms per loop

In [375]: %timeit sparse_matrix_mult_sparseX_mod1(X, map_fn)
1 loop, best of 3: 906 ms per loop

In [376]: %timeit sparse_matrix_mult_sparseX_mod2(X, map_fn)
1 loop, best of 3: 705 ms per loop

With the dense output from app1 -

In [379]: %timeit app1(X, map_fn).toarray()
1 loop, best of 3: 910 ms per loop
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!