Setting elements in .data attribute to zero: unpleasant behavior in scipy.sparse

Submitted by 杀马特。学长 韩版系。学妹 on 2019-12-10 18:34:52

Question


I am getting unpleasant behavior when I set values in .data of a csr_matrix to zero. Here is an example:

from scipy import sparse
a = sparse.csr_matrix([[0,0,2,0], [1,1,0,0],[0,3,0,0]])

Output:

>>> a.A
array([[0, 0, 2, 0],
       [1, 1, 0, 0],
       [0, 3, 0, 0]])
>>> a.data
array([2, 1, 1, 3])
>>> a.data[3] = 0   # setting one element to zero
>>> a.A
array([[0, 0, 2, 0],
       [1, 1, 0, 0],
       [0, 0, 0, 0]])
>>> a.data
array([2, 1, 1, 0]) # however, this zero is still considered part of data
                    # what I would like to see is:
                    # array([2, 1, 1])

>>> a.nnz           # also `nnz` tells me that there are 4 non-zero elements
                    # which is incorrect, I would like 3 as an output
4

>>> a.nonzero()     # nonzero method does follow the behavior I expected
(array([0, 1, 1], dtype=int32), array([2, 0, 1], dtype=int32))

What is the best practice in the above situation? Should setting elements of .data to zero be avoided? Is .nnz an unreliable way to find the number of nonzero elements?


Answer 1:


Sparse matrices in scipy (at least CSC and CSR) have an .eliminate_zeros() method to handle these situations. Run

a.eliminate_zeros()

every time you mess with a.data, and it should take care of it.
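
As a rough illustration of the above, here is a minimal sketch that replays the question's example and then calls eliminate_zeros(). It also uses count_nonzero(), which is not mentioned in the original answer; it is included here on the assumption that it is available in your SciPy version, as a way to count only the entries whose value is actually nonzero. The variable names (stored_before, true_nonzeros and so on) are made up for this example.

```python
from scipy import sparse

a = sparse.csr_matrix([[0, 0, 2, 0], [1, 1, 0, 0], [0, 3, 0, 0]])
a.data[3] = 0                        # leaves an explicitly stored zero behind

# nnz counts stored entries, so the explicit zero is still included here.
stored_before = a.nnz

# count_nonzero() looks at the values and skips the stored zero.
true_nonzeros = a.count_nonzero()

# Works in place: removes stored zeros from data and fixes up the index arrays.
a.eliminate_zeros()

stored_after = a.nnz
data_after = a.data.tolist()

print(stored_before, true_nonzeros, stored_after, data_after)
```

In other words, .nnz reports how many entries are stored rather than how many are nonzero, so the two only agree once explicit zeros have been eliminated. Writing zeros into .data is not wrong in itself, but it seems safest to follow it with eliminate_zeros() before relying on .nnz or .data.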



Source: https://stackoverflow.com/questions/19122024/setting-elements-in-data-attribute-to-zero-unpleasant-behaivor-in-scipy-sparse
