Question
Assume I have a matrix like:
4 0 3 5
0 2 6 0
7 0 1 0
I want it binarized as:
0 0 0 0
0 1 0 0
0 0 1 0
That is, with the threshold set to 2, any element greater than the threshold is set to 0, and any nonzero element less than or equal to the threshold is set to 1.
Can we do this on SciPy's csr_matrix or any other sparse matrix type in Python?
I know scikit-learn offers Binarizer, which replaces values below or equal to the threshold with 0 and values above it with 1.
Answer 1:
When dealing with a sparse matrix, s, avoid inequalities that include zero: a sparse matrix (if you're using it appropriately) should have a great many zeros, and forming an array of all the locations that are zero would be huge. So avoid s <= 2, for example. Use inequalities that select away from zero instead.
import numpy as np
from scipy import sparse
s = sparse.csr_matrix(np.array([[4, 0, 3, 5],
                                [0, 2, 6, 0],
                                [7, 0, 1, 0]]))
print(repr(s))
# e.g., depending on the SciPy version:
# <3x4 sparse matrix of type '<class 'numpy.int64'>'
#     with 7 stored elements in Compressed Sparse Row format>
s[s > 2] = 0
s[s != 0] = 1
print(s.todense())
yields
[[0 0 0 0]
 [0 1 0 0]
 [0 0 1 0]]
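The two masked assignments above leave explicit zeros stored in the matrix. As a rough alternative sketch (not part of the original answer), the same rule can be applied to the stored values only through the CSR `data` array, after which `eliminate_zeros()` drops the entries that became 0. The helper name `binarize_stored` is made up for illustration.

```python
import numpy as np
from scipy import sparse

def binarize_stored(s, threshold):
    """Return a CSR copy where nonzero values <= threshold become 1 and all others 0."""
    out = sparse.csr_matrix(s, copy=True)
    # .data holds only the explicitly stored entries, so implicit zeros are never visited.
    keep = (out.data != 0) & (out.data <= threshold)
    out.data[:] = keep          # True/False are written as 1/0 in the matrix dtype
    out.eliminate_zeros()       # remove the entries that were just set to 0
    return out

s = sparse.csr_matrix(np.array([[4, 0, 3, 5],
                                [0, 2, 6, 0],
                                [7, 0, 1, 0]]))
binary = binarize_stored(s, 2)
print(binary.toarray())
```

Because the input is copied first, `s` itself is left unchanged; dropping the copy would make this an in-place operation.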
Answer 2:
You can use numpy.where for this:
>>> import numpy as np
>>> import scipy.sparse
>>> mat = scipy.sparse.csr_matrix(np.array([[4, 0, 3, 5],
...                                         [0, 2, 6, 0],
...                                         [7, 0, 1, 0]])).todense()
>>> np.where(np.logical_and(mat <= 2, mat != 0), 1, 0)
matrix([[0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0]])
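If a dense result is acceptable, the same condition can also be written as a boolean mask cast to integers. This is only a minimal sketch: it assumes the matrix is small enough to densify, it uses `toarray()` so the result is a plain ndarray rather than a `numpy.matrix`, and the helper name `binarize_dense` is hypothetical.

```python
import numpy as np
from scipy import sparse

def binarize_dense(s, threshold):
    """Densify s and return an int ndarray: 1 where the value is nonzero and <= threshold."""
    a = s.toarray()  # materializes every zero, so only sensible for small matrices
    mask = (a != 0) & (a <= threshold)
    return mask.astype(int)

s = sparse.csr_matrix(np.array([[4, 0, 3, 5],
                                [0, 2, 6, 0],
                                [7, 0, 1, 0]]))
result = binarize_dense(s, 2)
print(result)
```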
Answer 3:
There may be a more efficient way to do this, but it can be achieved using a simple function and list operations, as below:
def binarized(matrix, threshold):
    # Modifies the nested list in place and returns it.
    for row in matrix:
        # Iterate over the columns of this row (not over the number of rows).
        for each in range(len(row)):
            if row[each] > threshold:
                row[each] = 0
            elif row[each] != 0:
                row[each] = 1
    return matrix

matrix = [[4, 0, 3, 5],
          [0, 2, 6, 0],
          [7, 0, 1, 0]]
print(binarized(matrix, 2))
Yields:
[[0, 0, 0, 0],
 [0, 1, 0, 0],
 [0, 0, 1, 0]]
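The function above overwrites the list it is given. If the original values are still needed, a non-mutating variant can be sketched with a nested list comprehension; the name `binarized_copy` is made up here and is not from the original answer.

```python
def binarized_copy(matrix, threshold):
    """Return a new nested list: 1 for nonzero values <= threshold, 0 otherwise."""
    return [[1 if value != 0 and value <= threshold else 0 for value in row]
            for row in matrix]

matrix = [[4, 0, 3, 5],
          [0, 2, 6, 0],
          [7, 0, 1, 0]]
result = binarized_copy(matrix, 2)
print(result)
```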
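Answer 4:
import numpy as np
x = np.array([[4, 0, 3, 5],
              [0, 2, 6, 0],
              [7, 0, 1, 0]])
threshold = 2
x[x <= 0] = threshold + 1   # move zeros above the threshold so they end up as 0
x[x <= threshold] = 1       # remaining small values become 1
x[x > threshold] = 0        # everything above the threshold becomes 0
print(x)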
Output:
[[0 0 0 0]
 [0 1 0 0]
 [0 0 1 0]]
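The order of the three assignments matters, which is easier to see by capturing the array after each step. The trace below is an illustrative sketch of the same dense approach; the names `after_step1` and `after_step2` are added here only for inspection. It assumes an integer threshold of at least 1, since otherwise the 1s written in step 2 would be zeroed again in step 3.

```python
import numpy as np

x = np.array([[4, 0, 3, 5],
              [0, 2, 6, 0],
              [7, 0, 1, 0]])
threshold = 2

# Step 1: push zeros (and any negatives) above the threshold so step 2 skips them.
x[x <= 0] = threshold + 1
after_step1 = x.copy()   # [[4 3 3 5], [3 2 6 3], [7 3 1 3]]

# Step 2: the values still at or below the threshold are the ones to keep as 1.
x[x <= threshold] = 1
after_step2 = x.copy()   # [[4 3 3 5], [3 1 6 3], [7 3 1 3]]

# Step 3: everything above the threshold, including the moved zeros, becomes 0.
x[x > threshold] = 0
print(x)
```

Note that negative entries end up as 0 with this approach, whereas a test of the form "nonzero and at most the threshold" would map them to 1.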
Source: https://stackoverflow.com/questions/27729810/binarize-a-sparse-matrix-in-python-in-a-different-way