Binarize a sparse matrix in Python in a different way

时光怂恿深爱的人放手 submitted on 2019-12-11 07:23:45

Question


Assume I have a matrix like:

4 0 3 5
0 2 6 0
7 0 1 0

I want it binarized as:

0 0 0 0
0 1 0 0
0 0 1 0

That is, with the threshold set to 2, any element greater than the threshold is set to 0, and any nonzero element less than or equal to the threshold is set to 1.

Can we do this on SciPy's csr_matrix or any other sparse matrix?

I know scikit-learn offers Binarizer, but it does the opposite: it replaces values below or equal to the threshold with 0 and values above it with 1.


Answer 1:


When dealing with a sparse matrix s, avoid comparisons that are true for zero. A sparse matrix (if you are using it appropriately) consists mostly of zeros, so a mask of every location where such a comparison holds would be huge. For example, avoid s <= 2. Use comparisons that are false at zero instead, such as s > 2 or s != 0.

import numpy as np
from scipy import sparse

s = sparse.csr_matrix(np.array([[4, 0, 3, 5],
                                [0, 2, 6, 0],
                                [7, 0, 1, 0]]))

print(repr(s))
# <3x4 sparse matrix of type '<class 'numpy.int64'>'
#   with 7 stored elements in Compressed Sparse Row format>
# (the exact wording depends on the Python and SciPy versions)

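s[s > 2] = 0         # entries above the threshold become 0
s[s != 0] = 1        # every remaining nonzero entry becomes 1
s.eliminate_zeros()  # drop any zeros that are still stored explicitly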

print(s.todense())

yields

matrix([[0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0]])
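
As an alternative, here is a minimal sketch that works on the stored values only, through the data attribute of a CSR matrix, so no mask over the whole matrix is ever built. The helper name binarize_below is hypothetical and not part of SciPy; the sketch assumes that every nonzero stored value at or below the threshold should become 1.

```python
import numpy as np
from scipy import sparse


def binarize_below(s, threshold):
    """Return a CSR copy: nonzero stored values <= threshold become 1, the rest 0.

    Hypothetical helper for illustration; not a SciPy function.
    """
    out = sparse.csr_matrix(s, copy=True)
    # out.data holds only the explicitly stored entries, so the implicit
    # zeros of the matrix are never visited.
    keep = (out.data != 0) & (out.data <= threshold)
    out.data = keep.astype(out.dtype)
    # Remove the entries that were just set to 0 from the sparse structure.
    out.eliminate_zeros()
    return out


s = sparse.csr_matrix(np.array([[4, 0, 3, 5],
                                [0, 2, 6, 0],
                                [7, 0, 1, 0]]))
result = binarize_below(s, 2)
print(result.toarray())
```

The input matrix is left unchanged, and the result stores only the two entries that ended up as 1.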



Answer 2:


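You can use numpy.where for this. Note that the matrix is converted to dense form first, so this only suits matrices that fit in memory: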

>>> import numpy as np
>>> import scipy.sparse
>>> mat = scipy.sparse.csr_matrix(np.array([[4, 0, 3, 5],
...                                         [0, 2, 6, 0],
...                                         [7, 0, 1, 0]])).todense()
>>> np.where(np.logical_and(mat <= 2, mat !=0), 1, 0)
matrix([[0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0]])
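
If a sparse result is still wanted after the dense computation, the mask can be wrapped back into a CSR matrix. This is only a sketch under the same assumption as above, namely that the dense copy fits in memory; the variable names are illustrative.

```python
import numpy as np
from scipy import sparse

s = sparse.csr_matrix(np.array([[4, 0, 3, 5],
                                [0, 2, 6, 0],
                                [7, 0, 1, 0]]))

dense = s.toarray()                  # materializes the full matrix
mask = (dense != 0) & (dense <= 2)   # True where a nonzero value is <= 2
binary = sparse.csr_matrix(mask.astype(dense.dtype))
print(binary.toarray())
```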



Answer 3:


There may be a more efficient way to do this, but it can be achieved with a simple function and plain list operations, as below:

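def binarized(matrix, threshold):
    # Modifies the nested list in place and returns it.
    for row in matrix:
        # Iterate over the columns of this row (not over the number of rows).
        for each in range(len(row)):
            if row[each] > threshold:
                row[each] = 0
            elif row[each] != 0:
                row[each] = 1
    return matrix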


matrix = [[4, 0, 3, 5],
          [0, 2, 6, 0],
          [7, 0, 1, 0]]

print(binarized(matrix, 2))

Yields:

[[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]



Answer 4:


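import numpy as np

x = np.array([[4, 0, 3, 5],
              [0, 2, 6, 0],
              [7, 0, 1, 0]])

threshold = 2
x[x <= 0] = threshold + 1  # push zeros (and negatives) above the threshold
x[x <= threshold] = 1      # values in (0, threshold] become 1
x[x > threshold] = 0       # everything above the threshold becomes 0
print(x)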

output:

[[0 0 0 0]
 [0 1 0 0]
 [0 0 1 0]]


Source: https://stackoverflow.com/questions/27729810/binarize-a-sparse-matrix-in-python-in-a-different-way
