Division of sparse matrix

前端 未结 1 947
离开以前
离开以前 2020-12-12 08:06

I have a scipy.sparse matrix with 45671x45671 elements. In this matrix, some rows contain only \'0\' value.

My question is, how to divide each row values by the row

1条回答
  •  没有蜡笔的小新
    2020-12-12 08:40

    I have an M hanging around:

    In [241]: M
    Out[241]: 
    <6x3 sparse matrix of type ''
        with 6 stored elements in Compressed Sparse Row format>
    In [242]: M.A
    Out[242]: 
    array([[1, 0, 0],
           [0, 1, 0],
           [0, 0, 1],
           [0, 1, 0],
           [0, 0, 1],
           [1, 0, 0]], dtype=uint8)
    In [243]: M.sum(1)            # dense matrix
    Out[243]: 
    matrix([[1],
            [1],
            [1],
            [1],
            [1],
            [1]], dtype=uint32)
    In [244]: M/M.sum(1)      # dense matrix - full size of M
    Out[244]: 
    matrix([[ 1.,  0.,  0.],
            [ 0.,  1.,  0.],
            [ 0.,  0.,  1.],
            [ 0.,  1.,  0.],
            [ 0.,  0.,  1.],
            [ 1.,  0.,  0.]])
    

    That will explain the memory error - if M is so large that M.A produces a memory error.


    In [262]: S = sparse.csr_matrix(M.sum(1))
    In [263]: S.shape
    Out[263]: (6, 1)
    In [264]: M.shape
    Out[264]: (6, 3)
    In [265]: M/S
    ....
    ValueError: inconsistent shapes
    

    I'm not entirely sure what is going on here.

    Element wise multiplication works

    In [266]: M.multiply(S)
    Out[266]: 
    <6x3 sparse matrix of type ''
        with 6 stored elements in Compressed Sparse Row format>
    

    So it should work if I construct S as S = sparse.csr_matrix(1/M.sum(1))

    If some of the rows sum to zero, you have a division by zero problem.


    If I modify M to have 0 row

    In [283]: M.A
    Out[283]: 
    array([[1, 0, 0],
           [0, 1, 0],
           [0, 0, 0],
           [0, 1, 0],
           [0, 0, 1],
           [1, 0, 0]], dtype=uint8)
    In [284]: S = sparse.csr_matrix(1/M.sum(1))
    /usr/local/bin/ipython3:1: RuntimeWarning: divide by zero encountered in true_divide
      #!/usr/bin/python3
    In [285]: S.A
    Out[285]: 
    array([[  1.],
           [  1.],
           [ inf],
           [  1.],
           [  1.],
           [  1.]])
    In [286]: M.multiply(S)
    Out[286]: 
    <6x3 sparse matrix of type ''
        with 5 stored elements in Compressed Sparse Row format>
    In [287]: _.A
    Out[287]: 
    array([[ 1.,  0.,  0.],
           [ 0.,  1.,  0.],
           [ 0.,  0.,  0.],
           [ 0.,  1.,  0.],
           [ 0.,  0.,  1.],
           [ 1.,  0.,  0.]])
    

    This isn't the best M to demonstrate this on, but it suggests a useful approach. The row sum will be dense, so you can clean up its inverse using the usual dense array approaches.

    0 讨论(0)
提交回复
热议问题