Question
I have data of the format (x_index, y_index, value) and I'm trying to create a CSR matrix using scipy (scipy.sparse.csr.csr_matrix).
For example, convert:
0 0 10
0 1 5
1 0 3
1 1 4
To the following:
10 5
3 4
I've read the documentation here: http://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html
However, I'm still not clear which of the examples applies to my use case.
Answer 1:
If you can separate the input data into a sequence of row indices, a sequence of column indices and a corresponding sequence of values, you can use the fourth option shown in the csr_matrix docstring, csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)]), to create the matrix.
For example, suppose you already have your data in a single array, data, where the first two columns are the indices and the third column holds the values. E.g.
In [213]: data
Out[213]:
array([[ 0,  0, 10],
       [ 0,  1,  5],
       [ 1,  0,  3],
       [ 1,  1,  4]])
Then you can create a CSR matrix as follows:
In [214]: a = csr_matrix((data[:, 2], (data[:, 0], data[:, 1])))
In [215]: a
Out[215]:
<2x2 sparse matrix of type '<type 'numpy.int64'>'
with 4 stored elements in Compressed Sparse Row format>
In [216]: a.toarray()
Out[216]:
array([[10,  5],
       [ 3,  4]])
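As a standalone script, the same construction might look like the sketch below. It assumes the triples arrive as a plain Python list (the name triples is only illustrative) and uses toarray() to get the dense result.

```python
import numpy as np
from scipy.sparse import csr_matrix

# (row_index, column_index, value) triples from the question
triples = [(0, 0, 10), (0, 1, 5), (1, 0, 3), (1, 1, 4)]

# Split the triples into three parallel arrays
data = np.array(triples)
rows, cols, vals = data[:, 0], data[:, 1], data[:, 2]

# Build the matrix from (values, (row indices, column indices))
a = csr_matrix((vals, (rows, cols)))

dense = a.toarray()
print(dense)
```

The printed array should match the 2x2 matrix the question asks for.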
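Depending on your data, you might need to specify the shape explicitly. By default the shape is inferred from the largest row and column indices, so trailing empty rows or columns are not included. For example, here I use the same data, but in a 3x3 matrix: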
In [217]: b = csr_matrix((data[:, 2], (data[:, 0], data[:, 1])), shape=(3, 3))
In [218]: b
Out[218]:
<3x3 sparse matrix of type '<type 'numpy.int64'>'
with 4 stored elements in Compressed Sparse Row format>
In [219]: b.toarray()
Out[219]:
array([[10,  5,  0],
       [ 3,  4,  0],
       [ 0,  0,  0]])
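To see the effect of the shape argument side by side, here is a small sketch using the same data. The variable names inferred and padded are just illustrative.

```python
import numpy as np
from scipy.sparse import csr_matrix

data = np.array([[0, 0, 10],
                 [0, 1, 5],
                 [1, 0, 3],
                 [1, 1, 4]])
rows, cols, vals = data[:, 0], data[:, 1], data[:, 2]

# Without shape, the dimensions come from the largest indices present
inferred = csr_matrix((vals, (rows, cols)))

# With shape, the matrix is padded with empty rows and columns
padded = csr_matrix((vals, (rows, cols)), shape=(3, 3))

print(inferred.shape)  # (2, 2)
print(padded.shape)    # (3, 3)
print(padded.nnz)      # 4, the padding stores no extra elements
```

Passing shape matters when the last rows or columns of the intended matrix happen to contain no entries, since they would otherwise be missing from the result.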
Source: https://stackoverflow.com/questions/32800395/create-csr-matrix-from-x-index-y-index-value