I am running into RuntimeWarning: invalid value encountered in divide
import numpy
a = numpy.random.rand(1000000, 100)
b = numpy.random.rand(1, 100)
You could use angles[~np.isfinite(angles)] = ... to replace non-finite values (nan, inf and -inf) with some other value.
For example:
In [103]: angles = dots/norms
In [104]: angles
Out[104]: array([[ nan, nan, nan, ..., nan, nan, nan]])
In [105]: angles[~np.isfinite(angles)] = -2
In [106]: angles
Out[106]: array([[-2., -2., -2., ..., -2., -2., -2.]])
Note that division by zero may result in infs, rather than nans,
In [140]: np.array([1, 2, 3, 4, 0])/np.array([1, 2, 0, -0., 0])
Out[140]: array([ 1., 1., inf, -inf, nan])
so it is better to call np.isfinite rather than np.isnan to identify the places where there was division by zero.
In [141]: np.isfinite(np.array([1, 2, 3, 4, 0])/np.array([1, 2, 0, -0., 0]))
Out[141]: array([ True, True, False, False, False], dtype=bool)
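If you also want to silence the warning itself, one option is to wrap the division in np.errstate and apply the np.isfinite mask in the same step. This is only a minimal sketch: the helper name safe_divide and the default fill value of -2 (borrowed from the example above) are assumptions, not part of NumPy.

```python
import numpy as np

def safe_divide(num, den, fill=-2.0):
    """Divide num by den, replacing non-finite results with fill (hypothetical helper)."""
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    # Suppress the divide-by-zero and invalid-value warnings inside this block only.
    with np.errstate(divide='ignore', invalid='ignore'):
        result = num / den
    # inf, -inf and nan all fail np.isfinite, so one mask covers every case.
    return np.where(np.isfinite(result), result, fill)

ratios = safe_divide([1, 2, 3, 4, 0], [1, 2, 0, -0.0, 0])
print(ratios)
```

The warning settings are restored as soon as the with block exits, so other divisions in your program still warn as usual.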
Note that if you only want the top few values (say, the top ten) from a NumPy array, using the np.argpartition function may be quicker than fully sorting the entire array, especially for large arrays. The example below uses N = 3:
In [110]: N = 3
In [111]: x = np.array([50, 40, 30, 20, 10, 0, 100, 90, 80, 70, 60])
In [112]: idx = np.argpartition(-x, N)
In [113]: idx
Out[113]: array([ 6, 7, 8, 9, 10, 0, 1, 4, 3, 2, 5])
In [114]: x[idx[:N]]
Out[114]: array([100, 90, 80])
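One caveat: np.argpartition only guarantees that the first N indices point at the N largest values, not that they come back in sorted order. If you need them ordered, you can sort just those N candidates afterwards. The following is a sketch under that assumption; top_n is a hypothetical helper name, and it assumes a 1-D array of a signed or floating-point dtype with n smaller than the array length.

```python
import numpy as np

def top_n(x, n):
    """Return the n largest values of a 1-D array, largest first (hypothetical helper)."""
    x = np.asarray(x)
    # The first n indices refer to the n largest values, in no particular order.
    # Requires n < len(x); negating x assumes a signed or float dtype.
    idx = np.argpartition(-x, n)[:n]
    # Sorting only n candidates is cheap when n is small.
    order = np.argsort(-x[idx])
    return x[idx[order]]

x = np.array([50, 40, 30, 20, 10, 0, 100, 90, 80, 70, 60])
top3 = top_n(x, 3)
print(top3)
```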
This shows that np.argpartition is quicker even for only moderately large arrays:
In [123]: x = np.array([50, 40, 30, 20, 10, 0, 100, 90, 80, 70, 60]*1000)
In [124]: %timeit np.sort(x)[-N:]
1000 loops, best of 3: 233 µs per loop
In [125]: %timeit idx = np.argpartition(-x, N); x[idx[:N]]
10000 loops, best of 3: 53.3 µs per loop
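If you are not working in IPython, a comparable measurement can be made with the standard timeit module. This is only a sketch: the function names by_sort and by_partition and the repeat count of 200 are arbitrary choices, and the absolute timings will differ by machine and NumPy version.

```python
import timeit
import numpy as np

N = 3
x = np.array([50, 40, 30, 20, 10, 0, 100, 90, 80, 70, 60] * 1000)

def by_sort():
    # Full sort, then take the last N (largest) values.
    return np.sort(x)[-N:]

def by_partition():
    # Partial partition: only the N largest are separated out.
    idx = np.argpartition(-x, N)
    return x[idx[:N]]

# Both approaches should select the same values, possibly in a different order.
same_values = sorted(by_sort().tolist()) == sorted(by_partition().tolist())

t_sort = timeit.timeit(by_sort, number=200)
t_part = timeit.timeit(by_partition, number=200)
print(f"sort: {t_sort:.4f}s  argpartition: {t_part:.4f}s  same values: {same_values}")
```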