Question
In another post about resizing a sparse matrix in SciPy, the accepted answer works when rows or columns are to be added, using scipy.sparse.vstack or hstack, respectively. In SciPy 0.12 the reshape and set_shape methods are still not implemented.
Are there any established good practices for reshaping a sparse matrix in SciPy 0.12? It would be nice to have some timing comparisons.
Answer 1:
As of SciPy 1.1.0, the reshape and set_shape methods have been implemented for all sparse matrix types. The signatures are what you would expect and match the equivalent NumPy methods as closely as is feasible (e.g. you can't reshape to a vector or tensor).
Signature:
reshape(self, shape: Tuple[int, int], order: 'C'|'F' = 'C', copy: bool = False) -> spmatrix
Example:
>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix([[0,0,2,0], [0,1,0,3]])
>>> print(A)
(0, 2) 2
(1, 1) 1
(1, 3) 3
>>> B = A.reshape((4,2))
>>> print(B)
(1, 0) 2
(2, 1) 1
(3, 1) 3
>>> C = A.reshape((4,2), order='F')
>>> print(C)
(0, 1) 2
(3, 0) 1
(3, 1) 3
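As a quick sanity check, the sparse result can be compared against NumPy's dense reshape for both orders. This is only a minimal sketch, assuming SciPy 1.1.0 or later; the helper name `matches_dense` is illustrative and not part of SciPy.

```python
import numpy as np
from scipy.sparse import csr_matrix

def matches_dense(sparse_mat, shape, order="C"):
    """Return True if the sparse reshape agrees with NumPy's dense reshape.

    Hypothetical helper for illustration; requires SciPy >= 1.1.0.
    """
    reshaped = sparse_mat.reshape(shape, order=order)
    expected = sparse_mat.toarray().reshape(shape, order=order)
    same_shape = reshaped.shape == tuple(shape)
    return same_shape and np.array_equal(reshaped.toarray(), expected)

A = csr_matrix([[0, 0, 2, 0], [0, 1, 0, 3]])
ok_c = matches_dense(A, (4, 2))             # row-major (C) order
ok_f = matches_dense(A, (4, 2), order="F")  # column-major (Fortran) order
```

Densifying is only reasonable for small test matrices like this one; it defeats the purpose on anything large.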
Full disclosure: I wrote the implementations.
Answer 2:
I don't know of any established good practices, so here's a fairly straightforward reshape function for a coo_matrix. It converts its argument to a coo_matrix, so it will actually work for other sparse formats (but it returns a coo_matrix).
from scipy.sparse import coo_matrix

def reshape(a, shape):
    """Reshape the sparse matrix `a`.

    Returns a coo_matrix with shape `shape`.
    """
    if not hasattr(shape, '__len__') or len(shape) != 2:
        raise ValueError('`shape` must be a sequence of two integers')
    c = a.tocoo()
    nrows, ncols = c.shape
    size = nrows * ncols
    new_size = shape[0] * shape[1]
    if new_size != size:
        raise ValueError('total size of new array must be unchanged')
    flat_indices = ncols * c.row + c.col
    new_row, new_col = divmod(flat_indices, shape[1])
    b = coo_matrix((c.data, (new_row, new_col)), shape=shape)
    return b
Example:
In [43]: a = coo_matrix([[0,10,0,0],[0,0,0,0],[0,20,30,40]])
In [44]: a.A
Out[44]:
array([[ 0, 10,  0,  0],
       [ 0,  0,  0,  0],
       [ 0, 20, 30, 40]])
In [45]: b = reshape(a, (2,6))
In [46]: b.A
Out[46]:
array([[ 0, 10,  0,  0,  0,  0],
       [ 0,  0,  0, 20, 30, 40]])
Now, I'm sure there are several regular contributors here who can come up with something better (faster, more memory efficient, less filling... :)
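One caveat with the function above: c.row and c.col are usually stored as 32-bit integers, so ncols * c.row can overflow once the matrix has more than about 2**31 cells in total. The following is a hedged sketch of the same arithmetic with the indices widened to int64 first; the name `reshape_int64` is hypothetical, and this is not necessarily how SciPy's own reshape handles it.

```python
import numpy as np
from scipy.sparse import coo_matrix

def reshape_int64(a, shape):
    """Row-major reshape of sparse matrix `a`; returns a coo_matrix.

    Illustrative variant (hypothetical name): same arithmetic as the
    function above, but with indices widened to int64 before multiplying.
    """
    c = a.tocoo()
    nrows, ncols = c.shape
    if shape[0] * shape[1] != nrows * ncols:
        raise ValueError('total size of new array must be unchanged')
    # Widen first: ncols * row can exceed the int32 range for large shapes.
    flat_indices = ncols * c.row.astype(np.int64) + c.col
    new_row, new_col = divmod(flat_indices, shape[1])
    return coo_matrix((c.data, (new_row, new_col)), shape=shape)

# One stored value in the last cell of a 100000 x 100000 matrix; its flat
# index (9999999999) does not fit in a 32-bit integer.
big = coo_matrix(([1.0], ([99999], [99999])), shape=(100000, 100000))
b = reshape_int64(big, (50000, 200000))
```

The stored value should still land in the last cell of the reshaped matrix, which is an easy property to check.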
Answer 3:
I have one working example for a CSR matrix, but I cannot guarantee that it always works (for instance, it breaks if any row of A has no stored entries).
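flattening the matrix A:
from numpy import cumsum, diff, prod, r_, size, where, zeros_like
from scipy import sparse
# Put A.shape[1] at the first stored entry of every row but the first, so
# that the cumulative sum gives each entry's row offset in the flat index.
indices = zeros_like(A.indices)
indices[A.indptr[1:-1]] = A.shape[1]
indices = cumsum(indices) + A.indices
# size(A) is the number of stored entries of the sparse matrix.
A_flat = sparse.csc_matrix((A.data, indices, [0, size(A)]), shape=(prod(A.shape), 1))
reshaping the matrix A:
# N (left undefined here) is the number of consecutive rows of A packed
# into each column of the result; it must divide A.shape[0].
indices = zeros_like(A.indices)
indices[A.indptr[1:-1]] = A.shape[1]
indices = cumsum(indices) + A.indices
indices %= N * A.shape[1]
# A drop in the wrapped index marks the start of the next column.
indptr = r_[0, where(diff(indices) < 0)[0] + 1, size(A)]
A_reshaped = sparse.csc_matrix((A.data, indices, indptr), shape=(N * A.shape[1], A.shape[0] // N))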
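For comparison, here is a sketch of a CSR-based row-major reshape that does not rely on every row having a stored entry. It expands indptr into explicit row indices with np.repeat instead of the cumsum trick. The function name `reshape_csr` is hypothetical, and this is an alternative technique rather than a repaired version of the snippet above.

```python
import numpy as np
from scipy import sparse

def reshape_csr(A, shape):
    """Row-major reshape of a CSR matrix, tolerating empty rows.

    Hypothetical helper for illustration; returns a csr_matrix.
    """
    A = sparse.csr_matrix(A)
    nrows, ncols = A.shape
    if shape[0] * shape[1] != nrows * ncols:
        raise ValueError('total size of new array must be unchanged')
    # Expand indptr into one row index per stored value; a row with no
    # stored values simply contributes zero repeats.
    counts = np.diff(A.indptr)
    rows = np.repeat(np.arange(nrows, dtype=np.int64), counts)
    flat = rows * ncols + A.indices
    new_row, new_col = divmod(flat, shape[1])
    return sparse.csr_matrix((A.data, (new_row, new_col)), shape=shape)

# The middle row is empty, which is the case the cumsum approach mishandles.
A = sparse.csr_matrix([[0, 10, 0, 0], [0, 0, 0, 0], [0, 20, 30, 40]])
B = reshape_csr(A, (2, 6))
flat_column = reshape_csr(A, (12, 1))
```

Passing (data, (row, col)) to csr_matrix lets SciPy rebuild indptr itself, at the cost of an internal conversion from COO.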
Source: https://stackoverflow.com/questions/16511879/reshape-sparse-matrix-efficiently-python-scipy-0-12