Reshape sparse matrix efficiently, Python, SciPy 0.12

Posted by 六月ゝ 毕业季﹏ on 2019-11-30 17:37:42

Question


In another post about resizing a sparse matrix in SciPy, the accepted answer works when rows or columns are to be added, using scipy.sparse.vstack or hstack, respectively. In SciPy 0.12, the reshape and set_shape methods are still not implemented.
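For context, growing a matrix with `vstack` or `hstack` might look like the sketch below. The matrix contents and the amount of padding are made up for illustration; only `scipy.sparse.vstack` and `hstack` come from the post itself.

```python
from scipy.sparse import csr_matrix, hstack, vstack

# Illustrative 2x3 matrix (values are arbitrary).
A = csr_matrix([[1, 0, 2], [0, 3, 0]])

# Append two empty rows: stack a 2x3 all-zero sparse block underneath.
taller = vstack([A, csr_matrix((2, A.shape[1]), dtype=A.dtype)], format="csr")

# Append one empty column: stack a 2x1 all-zero sparse block to the right.
wider = hstack([A, csr_matrix((A.shape[0], 1), dtype=A.dtype)], format="csr")

print(taller.shape, wider.shape)
```

This only ever adds rows or columns; it does not redistribute existing entries over a new shape, which is what the question is about.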

Are there any established good practices for reshaping a sparse matrix in SciPy 0.12? It would be nice to have some timing comparisons.


Answer 1:


As of SciPy 1.1.0, the reshape and set_shape methods have been implemented for all sparse matrix types. The signatures are what you would expect and match the equivalent NumPy methods as closely as feasible (e.g. you can't reshape to a vector or tensor).

Signature:

reshape(self, shape: Tuple[int, int], order: 'C'|'F' = 'C', copy: bool = False) -> spmatrix

Example:

>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix([[0,0,2,0], [0,1,0,3]])
>>> print(A)
  (0, 2)    2
  (1, 1)    1
  (1, 3)    3
>>> B = A.reshape((4,2))
>>> print(B)
  (1, 0)    2
  (2, 1)    1
  (3, 1)    3
>>> C = A.reshape((4,2), order='F')
>>> print(C)
  (0, 1)    2
  (3, 0)    1
  (3, 1)    3
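One way to check the ordering semantics is to compare against NumPy on the dense equivalent. The following is a quick sanity check rather than part of the SciPy API; it assumes SciPy 1.1.0 or later, and the helper name `matches_numpy` is made up for this sketch.

```python
import numpy as np
from scipy.sparse import csr_matrix


def matches_numpy(sparse_mat, shape, order):
    """Hypothetical helper: True if the sparse reshape agrees with NumPy's dense reshape."""
    expected = sparse_mat.toarray().reshape(shape, order=order)
    actual = sparse_mat.reshape(shape, order=order).toarray()
    return np.array_equal(expected, actual)


A = csr_matrix([[0, 0, 2, 0], [0, 1, 0, 3]])
for order in ("C", "F"):
    print(order, matches_numpy(A, (4, 2), order))
```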

Full disclosure: I wrote the implementations.




Answer 2:


I don't know of any established good practices, so here's a fairly straightforward reshape function for a coo_matrix. It converts its argument to a coo_matrix, so it will actually work for other sparse formats (but it returns a coo_matrix).

from scipy.sparse import coo_matrix


def reshape(a, shape):
    """Reshape the sparse matrix `a`.

    Returns a coo_matrix with shape `shape`.
    """
    if not hasattr(shape, '__len__') or len(shape) != 2:
        raise ValueError('`shape` must be a sequence of two integers')

    c = a.tocoo()
    nrows, ncols = c.shape
    size = nrows * ncols

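    new_size = shape[0] * shape[1]
    if new_size != size:
        raise ValueError('total size of new array must be unchanged')

    # Row-major position of each stored entry. Casting to int64 guards
    # against overflow of 32-bit index arrays on large matrices.
    flat_indices = ncols * c.row.astype('int64') + c.col
    # Split each position into (row, column) of the new shape.
    new_row, new_col = divmod(flat_indices, shape[1])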

    b = coo_matrix((c.data, (new_row, new_col)), shape=shape)
    return b

Example:

In [43]: a = coo_matrix([[0,10,0,0],[0,0,0,0],[0,20,30,40]])

In [44]: a.toarray()
Out[44]: 
array([[ 0, 10,  0,  0],
       [ 0,  0,  0,  0],
       [ 0, 20, 30, 40]])

In [45]: b = reshape(a, (2,6))

In [46]: b.toarray()
Out[46]: 
array([[ 0, 10,  0,  0,  0,  0],
       [ 0,  0,  0, 20, 30, 40]])
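The index arithmetic inside `reshape` can be traced by hand for this example. The sketch below repeats the two key steps on the stored entries of `a` using plain NumPy arrays; the variable `new_ncols` is introduced here just to name `shape[1]`.

```python
import numpy as np

# Stored entries of the 3x4 example: (0,1)=10, (2,1)=20, (2,2)=30, (2,3)=40
row = np.array([0, 2, 2, 2])
col = np.array([1, 1, 2, 3])
ncols = 4          # columns of the original matrix
new_ncols = 6      # columns of the target shape (2, 6)

# Position of each entry in the row-major (C-order) flattened matrix.
flat_indices = ncols * row + col

# Split each flat position into (row, column) of the new shape.
new_row, new_col = divmod(flat_indices, new_ncols)

print(flat_indices.tolist())               # [1, 9, 10, 11]
print(new_row.tolist(), new_col.tolist())  # [0, 1, 1, 1] [1, 3, 4, 5]
```

The entry 10 stays at (0, 1), while 20, 30 and 40 land in row 1 at columns 3, 4 and 5, matching the output shown.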

Now, I'm sure there are several regular contributors here who can come up with something better (faster, more memory efficient, less filling... :)
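The question also asks for timings. No measurements are reported in this thread, so the following is only a harness you could run yourself. It assumes SciPy 1.1.0 or later (for the built-in `reshape`), uses `scipy.sparse.random` to build an arbitrary test matrix, and re-declares the function above in condensed form under the hypothetical name `reshape_coo`. The matrix size, density and repeat count are arbitrary choices.

```python
import timeit

import numpy as np
from scipy import sparse


def reshape_coo(a, shape):
    """COO-based reshape (C order only), condensed from the function above."""
    c = a.tocoo()
    nrows, ncols = c.shape
    if shape[0] * shape[1] != nrows * ncols:
        raise ValueError('total size of new array must be unchanged')
    flat = ncols * c.row.astype(np.int64) + c.col
    new_row, new_col = divmod(flat, shape[1])
    return sparse.coo_matrix((c.data, (new_row, new_col)), shape=shape)


# Arbitrary test case: 2000 x 3000 matrix with about 0.1% of entries stored.
A = sparse.random(2000, 3000, density=0.001, format='csr', random_state=0)
new_shape = (3000, 2000)

# Both approaches should produce the same matrix before they are timed.
assert (reshape_coo(A, new_shape) != A.reshape(new_shape)).nnz == 0

candidates = [
    ('reshape_coo', lambda: reshape_coo(A, new_shape)),
    ('built-in reshape', lambda: A.reshape(new_shape)),
]
for name, fn in candidates:
    seconds = timeit.timeit(fn, number=20) / 20
    print(f'{name}: {seconds * 1e3:.3f} ms per call')
```

Results will depend on the SciPy version, the input format and the sparsity pattern, so it is worth repeating the run with matrices that resemble your own data.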




Answer 3:


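I have one working example for a CSR matrix, but I cannot guarantee that it always works.

Flattening the matrix A:

    import numpy as np
    from scipy import sparse
    # Row offsets via cumsum; assumes every row of A stores at least one entry.
    indices = np.zeros_like(A.indices)
    indices[A.indptr[1:-1]] = A.shape[1]
    indices = np.cumsum(indices) + A.indices
    A_flat = sparse.csc_matrix((A.data, indices, [0, A.nnz]), shape=(np.prod(A.shape), 1))

Reshaping the matrix A:

    # N: presumably the number of consecutive rows of A folded into each column.
    indices = np.zeros_like(A.indices)
    indices[A.indptr[1:-1]] = A.shape[1]
    indices = np.cumsum(indices) + A.indices

    indices %= N * A.shape[1]
    # A new column is assumed to start wherever the wrapped index decreases.
    indptr = np.r_[0, np.where(np.diff(indices) < 0)[0] + 1, A.nnz]
    A_reshaped = sparse.csc_matrix((A.data, indices, indptr), shape=(N * A.shape[1], A.shape[0] // N))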
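Detecting column boundaries from `diff(indices) < 0` depends on the sparsity pattern (empty rows, for instance, break it), which may be why the answer is hedged. A sketch that avoids this dependency is given below. It is not the author's method: it swaps the `cumsum` trick for `np.repeat` and builds the column pointers with `np.bincount`. The function name `fold_rows_csc` and the reading of `N` as the number of consecutive rows of `A` folded into each output column are assumptions inferred from the snippet.

```python
import numpy as np
from scipy import sparse


def fold_rows_csc(A, N):
    """Hypothetical helper: fold every N consecutive rows of A into one CSC column.

    The result has shape (N * ncols, nrows // N) and equals the transpose of
    the C-order reshape of A to (nrows // N, N * ncols).
    """
    # Work on a copy with sorted indices so entries are in row-major order.
    A = sparse.csr_matrix(A).sorted_indices()
    nrows, ncols = A.shape
    if nrows % N != 0:
        raise ValueError('number of rows must be divisible by N')

    # Row number of every stored entry; np.repeat copes with empty rows.
    rows = np.repeat(np.arange(nrows, dtype=np.int64), np.diff(A.indptr))
    flat = rows * ncols + A.indices       # row-major flat position

    col_len = N * ncols
    new_col, new_row = np.divmod(flat, col_len)

    # Column pointers from per-column entry counts (empty columns included).
    counts = np.bincount(new_col, minlength=nrows // N)
    indptr = np.concatenate(([0], np.cumsum(counts)))
    return sparse.csc_matrix((A.data, new_row, indptr), shape=(col_len, nrows // N))


# Example with an empty row, which the cumsum-based snippet does not handle.
A = sparse.csr_matrix([[0, 1, 0], [0, 0, 0], [2, 0, 0], [0, 0, 3]])
B = fold_rows_csc(A, 2)
print(B.toarray())
```

On SciPy 1.1.0 or later, `A.reshape((nrows // N, N * ncols)).T` should give the same matrix with less code; the manual version is mainly of interest for older releases.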


Source: https://stackoverflow.com/questions/16511879/reshape-sparse-matrix-efficiently-python-scipy-0-12
