Weird numpy.sum behavior when adding zeros

Posted by 一曲冷凌霜 on 2019-12-04 08:56:13

Question


I understand how mathematically equivalent arithmetic operations can give different results due to numerical errors (e.g. summing floats in different orders).

However, it surprises me that adding zeros to a sum can change the result. I thought that this always holds for floats, no matter what: x + 0. == x.

Here's an example. I expected all the lines to be exactly zero. Can anybody please explain why this happens?

import numpy as np

M = 4  # number of random values
Z = 4  # number of additional zeros
for i in range(20):
    a = np.random.rand(M)
    b = np.zeros(M + Z)
    b[:M] = a  # b holds the same values as a, followed by Z zeros
    print(a.sum() - b.sum())

-4.4408920985e-16
0.0
0.0
0.0
4.4408920985e-16
0.0
-4.4408920985e-16
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
2.22044604925e-16
0.0
4.4408920985e-16
4.4408920985e-16
0.0

It seems not to happen for smaller values of M and Z.

I also made sure a.dtype==b.dtype.

Here is one more example, which also shows that Python's built-in sum behaves as expected:

import numpy as np

a = np.array([0.1,      1.0/3,      1.0/7,      1.0/13, 1.0/23])
b = np.array([0.1, 0.0, 1.0/3, 0.0, 1.0/7, 0.0, 1.0/13, 1.0/23])
print(a.sum() - b.sum())
# => -1.11022302463e-16
print(sum(a) - sum(b))
# => 0.0

I'm using NumPy 1.9.2.


Answer 1:


Short answer: You are seeing the difference between

a + b + c + d

and

(a + b) + (c + d)

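The first is evaluated left to right, as ((a + b) + c) + d, so the two expressions round at different points and, because of floating-point inaccuracies, need not give the same result.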
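To make the difference between the two groupings concrete, here is a small sketch in plain Python. The values are chosen for illustration only (they do not come from the question): near 1e16 adjacent doubles are 2 apart, so the rounding is easy to follow by hand.

```python
# Illustrative values, not taken from the question.
a, b, c, d = 1e16, 1.0, 1.0, 1.0

left_to_right = ((a + b) + c) + d   # each "+ 1.0" is rounded away
pairwise = (a + b) + (c + d)        # 1.0 + 1.0 = 2.0 survives the final add

print(left_to_right == pairwise)    # False
print(pairwise - left_to_right)     # 2.0
```

With values of similar magnitude, as in the question, the same effect shows up only in the last bit or two of the result.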

Long answer: NumPy implements pairwise summation, which improves both speed (it is easier to vectorize) and rounding error.

The NumPy sum implementation is the function pairwise_sum_@TYPE@ in the NumPy source. It essentially does the following:

  1. If the length of the array is less than 8, a regular for-loop summation is performed. This is why the strange result is not observed if Z < 4 in your case (with M = 4, the padded array then has fewer than 8 elements): the same for-loop summation is used in both cases.
  2. If the length is between 8 and 128, it accumulates the sums in 8 bins r[0]-r[7] then sums them by ((r[0] + r[1]) + (r[2] + r[3])) + ((r[4] + r[5]) + (r[6] + r[7])).
  3. Otherwise, it recursively sums two halves of the array.

Therefore, in the first case you get a.sum() = ((a[0] + a[1]) + a[2]) + a[3], and in the second case b.sum() = (a[0] + a[1]) + (a[2] + a[3]), since the four trailing zeros contribute nothing. The two groupings round differently, which can lead to a.sum() - b.sum() != 0.
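The three cases above can be sketched in pure Python. This is only an approximation of the description in this answer, not NumPy's actual C code: the function name `pairwise_sum`, the `block` parameter, the handling of leftover elements in the middle case, and the exact split point in the recursive case are all assumptions made for illustration.

```python
def pairwise_sum(a, block=128):
    """Rough sketch of the summation scheme described above (not NumPy's code)."""
    n = len(a)
    if n < 8:
        # Case 1: plain left-to-right loop.
        total = 0.0
        for x in a:
            total += x
        return total
    if n <= block:
        # Case 2: eight running accumulators, combined pairwise at the end.
        r = list(a[:8])
        full = n - (n % 8)  # largest multiple of 8 that fits
        for i in range(8, full, 8):
            for j in range(8):
                r[j] += a[i + j]
        total = ((r[0] + r[1]) + (r[2] + r[3])) + ((r[4] + r[5]) + (r[6] + r[7]))
        # Assumption: any leftover elements are added one by one.
        for x in a[full:]:
            total += x
        return total
    # Case 3: recurse on two halves (assumed split at n // 2).
    half = n // 2
    return pairwise_sum(a[:half], block) + pairwise_sum(a[half:], block)


# The second example from the question: same values, with zeros interleaved.
a = [0.1, 1.0/3, 1.0/7, 1.0/13, 1.0/23]
b = [0.1, 0.0, 1.0/3, 0.0, 1.0/7, 0.0, 1.0/13, 1.0/23]
print(pairwise_sum(a) - pairwise_sum(b))
```

Here `a` has 5 elements and takes the plain loop, while `b` has 8 and takes the eight-accumulator path, so the additions are grouped differently even though the nonzero values are identical.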



Source: https://stackoverflow.com/questions/30998305/weird-numpy-sum-behavior-when-adding-zeros
