Question
Given an array, I want to normalize it such that each row sums to 1.
I currently have the following code:
import numpy
w = numpy.array([[0, 1, 0, 1, 0, 0],
                 [1, 0, 0, 0, 0, 1],
                 [0, 0, 0, 0, 0, 1],
                 [1, 0, 0, 0, 1, 0],
                 [0, 0, 0, 1, 0, 1],
                 [0, 1, 1, 0, 1, 0]], dtype=float)
def rownormalize(array):
    i = 0
    for row in array:
        array[i,:] = array[i,:]/sum(row)
        i += 1
I have two questions:
1) The code works, but I'm wondering if there's a more elegant way.
2) How can I convert the data type into a float if it's int? I tried
if array.dtype == int:
    array.dtype = float
But it doesn't work.
Answer 1:
You can do 1) like this:
array /= array.sum(axis=1, keepdims=True)
and 2) like this (astype returns a converted copy, so assign the result back):
array = array.astype(float)
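Putting the two pieces together, here is a minimal sketch. The helper name row_normalize and the small w_int array are made up for illustration and are not from the question. It converts first and divides second, because in-place division on an integer array raises an error. The dtype assignment attempted in the question does not help, since it reinterprets the existing bytes instead of converting the values.

```python
import numpy as np

def row_normalize(a):
    """Return a float copy of `a` in which every row sums to 1 (hypothetical helper)."""
    # astype converts the values and returns a new array; the input is left untouched.
    out = np.asarray(a).astype(float)
    # keepdims=True keeps the row sums as a column of shape (n, 1),
    # so the division broadcasts across each row.
    out /= out.sum(axis=1, keepdims=True)
    return out

# Example data, assumed for illustration: an integer array.
w_int = np.array([[0, 1, 0, 1],
                  [1, 0, 0, 0],
                  [2, 2, 2, 2]])

# In-place division on an integer array is rejected, because the float
# result cannot be stored back into integer storage.
try:
    w_int /= w_int.sum(axis=1, keepdims=True)
    inplace_failed = False
except TypeError:
    inplace_failed = True

w_norm = row_normalize(w_int)
print(w_norm.sum(axis=1))  # each row sum should be 1 up to rounding
```

Note that a row whose sum is zero would produce nan values here; the arrays in this thread have no such rows, so the sketch does not guard against that case.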
Answer 2:
Division, even when broadcast across all elements, can be expensive. A performance-oriented alternative is to precompute the reciprocals of the row sums and use them in a broadcast multiplication instead, like so:
w *= 1.0/w.sum(1,keepdims=1)
Runtime test:
In [588]: w = np.random.rand(3000,3000)
In [589]: out1 = w/w.sum(axis=1, keepdims=True) #@Julien Bernu's soln
In [590]: out2 = w*(1.0/w.sum(1,keepdims=1))
In [591]: np.allclose(out1,out2)
Out[591]: True
In [592]: %timeit w/w.sum(axis=1, keepdims=True) #@Julien Bernu's soln
10 loops, best of 3: 66.7 ms per loop
In [593]: %timeit w*(1.0/w.sum(1,keepdims=1))
10 loops, best of 3: 40 ms per loop
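For anyone who wants to repeat the comparison outside IPython, here is a rough sketch using the standard timeit module. The function names by_division and by_reciprocal, the seed, and the smaller 1000x1000 array are choices made for this example, not part of the original test. Timings depend on the machine and NumPy build, so the gap may differ from the numbers above.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily for repeatability
w = rng.random((1000, 1000))     # smaller than the 3000x3000 array used above

def by_division(a):
    # One broadcast division per element.
    return a / a.sum(axis=1, keepdims=True)

def by_reciprocal(a):
    # One division per row, then a broadcast multiplication per element.
    return a * (1.0 / a.sum(axis=1, keepdims=True))

# Both approaches should agree up to floating-point rounding.
same = np.allclose(by_division(w), by_reciprocal(w))
print("results match:", same)

runs = 10
t_div = timeit.timeit(lambda: by_division(w), number=runs)
t_rec = timeit.timeit(lambda: by_reciprocal(w), number=runs)
print(f"division:   {t_div / runs * 1e3:.2f} ms per call")
print(f"reciprocal: {t_rec / runs * 1e3:.2f} ms per call")
```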
Source: https://stackoverflow.com/questions/40578781/normalizing-a-numpy-array