Normalizing vector produces nan in Numpy

允我心安 提交于 2021-02-19 07:44:20

问题


I'm getting some strange behavior from scipy/numpy that I suspect is a bug but someone may know better? I've got a pair of long arrays which I'm breaking into frames which are of length 2-4 for debugging purposes. I want to normalize each pair of frames and take the dot product. The code that does it (with some debugging output) is:

   tf = numpy.copy(t_frame) / norm(t_frame)
   pf = numpy.copy(p_frame) / norm(p_frame)
   print "OPF:"
   print p_frame
   print "PF: "
   print pf
   print "TF norm is: " + str(norm(tf))
   print "PF norm is: " + str(norm(pf))
   print numpy.dot(tf, pf)
   return numpy.dot(tf, pf)

This does what I'd expect for a while (specifically giving a norm of 1 for tf and pf) but then I start to see lines like this:

OPF:

[ -91 -119 -137 -132]

PF:

[ nan nan nan nan]

What?? This can be normalized fine in a new Python window:

>>> p = [ -91, -119, -137, -132] 
>>> p / norm(p)
array([-0.37580532, -0.49143773, -0.56577285, -0.54512421])

For what it's worth, I've tried numpy.linalg.norm, scipy.linalg.norm, and defining a function to return the square root of the dot product.

Any ideas?

UPDATE: Thanks for the suggestions! I tried switching the dtype to float128 and am sadly getting similar behavior. I'm actually inclined to believe that it's a bug in Python rather than numpy at this point:

  1. If it were a straightforward overflow issue, it seems like I'd get it consistently with a given list. But the norm computes fine if I do it in a new python session.
  2. I tried rolling my own:

    def norm(v):
       return (  sum(numpy.array(v)*numpy.array(v)))**(0.5)
    

    This only uses numpy to represent the arrays. I still get the same issue, but later in the data set (and no runtime warnings). It's doing about 37000 of these computations.

  3. I'm actually computing the norm on two frames, a t_frame and a p_frame. The computation of one chokes if and only if the computation for the other one does.

Put together, I think there's some weird buffer overflow somewhere in the bowels of Python (2.7.9)??? I ultimately need these computations to be fast as well; so I'm thinking of just switching over to Cython for that computation.

Update 2: I tried really rolling my own:

def norm(v):
  sum = float(0)
  for i in range(len(v)):
    sum += v[i]**2
  return sum**(0.5)

and the problem disappears. So I would guess that it is a bug in numpy (1.9.0 on Gentoo Linux).


回答1:


It looks like this is a bug in numpy. I can reproduce the problem if the data type of the array is np.int16:

In [1]: np.__version__
Out[1]: '1.9.2'

In [2]: x = np.array([ -91, -119, -137, -132], dtype=np.int16)

In [3]: x
Out[3]: array([ -91, -119, -137, -132], dtype=int16)

In [4]: np.linalg.norm(x)
/Users/warren/anaconda/lib/python2.7/site-packages/numpy/linalg/linalg.py:2061: RuntimeWarning: invalid value encountered in sqrt
  return sqrt(sqnorm)
Out[4]: nan

The problem also occurs in the master branch of the development version of numpy. I created an issue here: https://github.com/numpy/numpy/issues/6128

If p_frame is, in fact, a 16 bit integer array, a simple work-around is something like:

x = np.asarray(p_frame, dtype=np.float64)
pf = x / norm(x)



回答2:


Following one of Warren's links, I get this warning:

In [1016]: np.linalg.norm(100000*np.ones(2).astype('int16'))
/usr/local/lib/python2.7/site-packages/numpy/linalg/linalg.py:2051: RuntimeWarning: invalid value encountered in sqrt
  return sqrt(add.reduce((x.conj() * x).real, axis=None))

For this x2, the inner expression is negative - the result of overflow in a small dtype.

In [1040]: x2=100000*np.ones(2).astype('int16')
In [1041]: np.add.reduce((x2.conj()*x2).real,axis=None)
Out[1041]: -1474836480

similarly with an x1:

In [1042]: x1
Out[1042]: array([ -9100, -11900, -13700, -13200], dtype=int16)
In [1043]: np.add.reduce((x1.conj()*x1).real,axis=None)
Out[1043]: -66128

If the sum of the 'dot' becomes too large for the dtype, it can be negative, producing a nan when passed through sqrt.

(I'm using 1.8.2 and 1.9.0 under linux).



来源:https://stackoverflow.com/questions/31685156/normalizing-vector-produces-nan-in-numpy

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!