Efficiently multiply a dense matrix by a sparse vector

Question

I am looking for an efficient way to multiply a dense matrix by a sparse vector, Av, where A is of size (M x N) and v is (N x 1). The vector v is a scipy.sparse.csc_matrix.

I have two methods I use at the moment:

In method 1, I pick out the non-zero values of v, say vi, multiply each one by the corresponding column of A, and sum the scaled columns. So if y = Av, then y = A[:, 0]*v0 + ... + A[:, N-1]*v(N-1), keeping only the terms where vi is non-zero.

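import numpy as np

def dense_dot_sparse(dense_matrix, sparse_column):
    # Accumulate A[:, i] * v_i over the non-zero entries of v only.
    prod = np.zeros(dense_matrix.shape[0])
    r, c = sparse_column.nonzero()
    for ind in zip(r, c):
        # ind[1] is the entry's column index in the sparse object; it selects the column of A.
        prod = prod + dense_matrix[:, ind[1]] * sparse_column[ind]
    return prod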
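
For comparison, the same sum over the non-zero entries can be sketched without the Python loop, by selecting the matching columns of A in one indexing step. This is only a sketch: the helper name dense_dot_sparse_vectorized is hypothetical, and it assumes v is a scipy.sparse matrix stored either as (N x 1) or as (1 x N).

```python
import numpy as np
from scipy.sparse import csc_matrix


def dense_dot_sparse_vectorized(dense_matrix, sparse_vector):
    # Hypothetical helper: y = sum of A[:, i] * v_i over the non-zero v_i only.
    coo = sparse_vector.tocoo()
    # Use the row index for an (N x 1) column, the column index for a (1 x N) row.
    idx = coo.row if coo.shape[1] == 1 else coo.col
    # Select the matching columns of A and take one small dense product.
    return dense_matrix[:, idx].dot(coo.data)


A = np.arange(12.0).reshape(3, 4)
v = csc_matrix(np.array([[0.0], [2.0], [0.0], [1.0]]))  # (4 x 1), two non-zeros
y = dense_dot_sparse_vectorized(A, v)  # 2 * A[:, 1] + 1 * A[:, 3] = [5., 17., 29.]
```

The work here scales with the number of non-zeros in v, like method 1, but the per-entry sparse indexing is replaced by a single array lookup.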

In method 2, I simply convert the sparse vector to a dense one with .todense() and use np.dot().

def dense_dot_sparse2(dense_matrix, sparse_column):
    return np.dot(dense_matrix, sparse_column.todense())

The typical size of A is (512 x 2048), and v has between 1 and 200 non-zero entries. I choose which method to use based on the number of non-zeros in v. With ~200 non-zeros, method 1 takes ~45 ms and method 2 takes ~5 ms. But when v is very sparse (~1 non-zero), method 1 takes ~1 ms whereas method 2 still takes ~5 ms. Checking the number of non-zeros (.nnz) adds nearly another 0.2 ms.
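
The selection between the two methods described above can be sketched roughly as follows. This is a minimal sketch, assuming v is stored as an (N x 1) csc_matrix; NNZ_THRESHOLD and the function names are hypothetical, and the cut-over value is a placeholder rather than a measured number.

```python
import numpy as np
from scipy.sparse import csc_matrix

NNZ_THRESHOLD = 20  # hypothetical cut-over point; tune by timing on real data


def column_sum(dense_matrix, sparse_column):
    # Method 1 for an (N x 1) column: add A[:, i] * v_i for each non-zero v_i.
    prod = np.zeros(dense_matrix.shape[0])
    rows, _ = sparse_column.nonzero()
    for i in rows:
        prod = prod + dense_matrix[:, i] * sparse_column[i, 0]
    return prod


def densify_and_dot(dense_matrix, sparse_column):
    # Method 2: convert v to a dense 1-D array and use a plain dense product.
    return np.dot(dense_matrix, sparse_column.toarray().ravel())


def multiply(dense_matrix, sparse_column):
    # Pick the method from the number of stored non-zeros in v.
    if sparse_column.nnz <= NNZ_THRESHOLD:
        return column_sum(dense_matrix, sparse_column)
    return densify_and_dot(dense_matrix, sparse_column)


rng = np.random.RandomState(0)
A = rng.rand(8, 50)
dense_v = np.zeros((50, 1))
dense_v[[3, 17], 0] = [1.5, -2.0]  # two non-zeros, so method 1 is chosen
v = csc_matrix(dense_v)
y = multiply(A, v)
```

Both branches return a 1-D array of length M, so the caller does not need to know which method ran.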

I have to perform about 1500 of these multiplications (after splitting up my data and multiprocessing), so the time adds up.

[EDIT: Adding a simple representative example

import numpy as np
from scipy.sparse import csc_matrix

rows = 512
cols = 2048
sparsity = 0.001  # very sparse: 0.001 for ~1 non-zero; moderately sparse: 0.1 for ~200 non-zeros
big_matrix = np.random.rand(rows, cols)  # use as dense matrix
col = np.random.rand(cols, 1)
# keep only the entries below the threshold and zero out the rest
col = np.array([i[0] if i[0] < sparsity else 0.0 for i in col])
sparse_col = csc_matrix(col)  # use as sparse vector
print(sparse_col.nnz)

END EDIT]

I am looking for a single implementation that is fast for both very sparse and moderately sparse v.

Source: https://stackoverflow.com/questions/29871460/efficiently-multiply-a-dense-matrix-by-a-sparse-vector

Tags

python

python-2.7

scipy

sparse-matrix