Question
If you have a sparse matrix X:
>> X = csr_matrix([[0,2,0,2],[0,2,0,1]])
>> print(type(X))
>> print(X.todense())
<class 'scipy.sparse.csr.csr_matrix'>
[[0 2 0 2]
 [0 2 0 1]]
And a matrix Y:
>> print(type(Y))
>> print(Y)
<class 'numpy.matrixlib.defmatrix.matrix'>
[[8]
 [5]]
How can you multiply every element in each row of X by the value in the corresponding row of Y? For example:
[[0*8 2*8 0*8 2*8]
 [0*5 2*5 0*5 1*5]]
or:
[[ 0 16  0 16]
 [ 0 10  0  5]]
I've tried this, but obviously it doesn't work because the dimensions don't match:
Z = X.data * Y
Answer 1:
Unfortunately, the .multiply method of the CSR matrix seems to densify the matrix if the other operand is dense. Here is one way to avoid that:
# Assuming that Y is 1D, might need to do Y = Y.A.ravel() or such...
# just to make the point that this works only with CSR:
if not isinstance(X, scipy.sparse.csr_matrix):
    raise ValueError('Matrix must be CSR.')
Z = X.copy()
# simply repeat each value in Y by the number of nnz elements in each row:
Z.data *= Y.repeat(np.diff(Z.indptr))
This does create some temporaries, but at least it's fully vectorized, and it does not densify the sparse matrix.
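As a self-contained sketch of the CSR approach above, using the X and Y from the question: the name y and the use of np.asarray(...).ravel() to flatten the column are my own choices for illustration, and a plain ndarray stands in for the np.matrix in the question.

```python
import numpy as np
import scipy.sparse as sps

X = sps.csr_matrix([[0, 2, 0, 2], [0, 2, 0, 1]])
Y = np.array([[8], [5]])  # one scale factor per row of X

# Flatten Y to 1D; this also works if Y is an np.matrix.
y = np.asarray(Y).ravel()

Z = X.copy()
# np.diff(Z.indptr) is the number of stored entries in each row,
# so each factor in y is repeated once per stored entry of its row.
Z.data *= np.repeat(y, np.diff(Z.indptr))

print(type(Z))
print(Z.toarray())
```

Z stays sparse throughout; only its data array is touched.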
For a COO matrix the equivalent is:
Z.data *= Y[Z.row] # you can use np.take, which is faster than indexing.
For a CSC matrix the equivalent would be:
Z.data *= Y[Z.indices]
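A minimal sketch of the COO and CSC variants, assuming Y has already been flattened to a 1D array (called y here, a name chosen for illustration). In both formats the row index of every stored entry is available directly, so it can be used to look up the matching factor.

```python
import numpy as np
import scipy.sparse as sps

X = sps.csr_matrix([[0, 2, 0, 2], [0, 2, 0, 1]])
y = np.array([8, 5])  # 1D: one scale factor per row

# COO: Z.row holds the row index of every stored entry.
Z_coo = X.tocoo(copy=True)  # copy so X's data is not modified in place
Z_coo.data *= np.take(y, Z_coo.row)

# CSC: Z.indices holds the row index of every stored entry.
Z_csc = X.tocsc()
Z_csc.data *= y[Z_csc.indices]

print(Z_coo.toarray())
print(Z_csc.toarray())
```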
Answer 2:
I had the same problem. Personally, I didn't find the scipy.sparse documentation very helpful, nor did I find a function that handles this directly. So I wrote it myself, and this worked for me:
Z = X.copy()
for row_y_idx in range(Y.shape[0]):
    Z.data[Z.indptr[row_y_idx]:Z.indptr[row_y_idx+1]] *= Y[row_y_idx, 0]
The idea is: for each row index row_y_idx, multiply the stored entries of the row_y_idx-th row of X by the scalar Y[row_y_idx, 0]. More info about accessing elements in CSR matrices here (where data is A, IA is indptr).
Given X and Y as you defined:
import numpy as np
import scipy.sparse as sps
X = sps.csr_matrix([[0,2,0,2],[0,2,0,1]])
Y = np.matrix([[8], [5]])
Z = X.copy()
for row_y_idx in range(Y.shape[0]):
    Z.data[Z.indptr[row_y_idx]:Z.indptr[row_y_idx+1]] *= Y[row_y_idx, 0]
print(type(Z))
print(Z.todense())
The output is the same as yours:
<class 'scipy.sparse.csr.csr_matrix'>
[[ 0 16 0 16]
[ 0 10 0 5]]
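To see why the slice Z.indptr[i]:Z.indptr[i+1] picks out the stored entries of row i, it can help to print the three CSR arrays for the X above. This is only an illustrative sketch; the loop variable names are mine.

```python
import scipy.sparse as sps

X = sps.csr_matrix([[0, 2, 0, 2], [0, 2, 0, 1]])

print(X.data)     # stored (nonzero) values, row by row: [2 2 2 1]
print(X.indices)  # column index of each stored value:   [1 3 1 3]
print(X.indptr)   # row i occupies data[indptr[i]:indptr[i+1]]: [0 2 4]

for i in range(X.shape[0]):
    start, stop = X.indptr[i], X.indptr[i + 1]
    print(i, X.indices[start:stop], X.data[start:stop])
```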
Source: https://stackoverflow.com/questions/12237954/multiplying-elements-in-a-sparse-array-with-rows-in-matrix