Question:
The basic question is: what happens under the hood when doing a[i] += b?
Given the following:
    import numpy as np
    a = np.arange(4)
    i = a > 0
    # i is array([False, True, True, True], dtype=bool)
I understand that:
- a[i] = x is the same as a.__setitem__(i, x), which assigns directly to the items indicated by i
- a += x is the same as a.__iadd__(x), which does the addition in place
But what happens when I do:
a[i] += x
Specifically:
- Is this the same as a[i] = a[i] + x (which is not an in-place operation)?
- Does it make a difference in this case if i is:
  - an int index, or
  - an ndarray, or
  - a slice object
Background
The reason I started delving into this is that I encountered a non-intuitive behavior when working with duplicate indices:
    a = np.zeros(4)
    x = np.arange(4)
    indices = np.zeros(4, dtype=int)  # duplicate indices (np.int was removed in newer NumPy)
    a[indices] += x
    # a is now array([ 3., 0., 0., 0.])
More interesting stuff about duplicate indices in this question.
Answer 1:
The first thing you need to realise is that a += x doesn't map exactly to a.__iadd__(x); instead it maps to a = a.__iadd__(x). Notice that the documentation specifically says that in-place operators return their result, and this doesn't have to be self (although in practice, it usually is). This means a[i] += x trivially maps to:
    a.__setitem__(i, a.__getitem__(i).__iadd__(x))
So, the addition technically happens in place, but only on a temporary object. There is still potentially one less temporary object created than if it called __add__, though.
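As a rough check of that expansion, the sketch below spells the three calls out by hand and compares the result with the augmented assignment. The variable names (tmp, view, and so on) are illustrative and do not come from the original answer.

```python
import numpy as np

x = np.arange(4)
indices = np.zeros(4, dtype=int)  # duplicate indices, as in the question

# Augmented assignment on a subscript.
a = np.zeros(4)
a[indices] += x

# The same three steps written out by hand.
b = np.zeros(4)
tmp = b.__getitem__(indices)  # fancy indexing returns a copy
tmp = tmp.__iadd__(x)         # in-place add, but only on the temporary
b.__setitem__(indices, tmp)   # the temporary is written back into b

print(np.array_equal(a, b))   # compares the two paths

# With a slice, __getitem__ returns a view, so __iadd__ already
# writes into the original array before __setitem__ runs.
c = np.zeros(4)
view = c[1:3]
view += 1
print(np.shares_memory(c, view), c.tolist())
```

This also bears on the question about index types: whether the temporary is a copy or a view depends on the kind of index, but the getitem / iadd / setitem sequence is the same in each case.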
Answer 2:
Actually, that has nothing to do with numpy. There is no "set/getitem in-place" in Python; these things are equivalent to a[indices] = a[indices] + x. Knowing that, it becomes pretty obvious what is going on. (EDIT: As lvc writes, the right-hand side is actually evaluated in place, so it is a[indices] = (a[indices] += x) if that were legal syntax; that has largely the same effect, though.)
Of course, a += x actually is in place: it maps a to the out argument of np.add.
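A small sketch of what that means in practice, assuming a plain float array; the names alias, same and fresh are made up for the illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
alias = a                     # second name for the same array

a += 4                        # in place: no new array is allocated
same = np.add(a, 4, out=a)    # the explicit spelling; returns the out array
fresh = a + 4                 # no out argument: a new array is created

print(same is a, fresh is a)
print(alias.tolist())         # alias sees both in-place additions
```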
This has been discussed before, and numpy cannot do anything about it as such. There is, though, an idea to have a np.add.at(array, index_expression, x) to at least allow such operations.
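NumPy now provides a ufunc method of this shape, np.add.at, which performs the operation unbuffered so that repeated indices accumulate. The sketch below contrasts it with the buffered a[indices] += x from the question; the variable names are illustrative.

```python
import numpy as np

x = np.arange(4)
indices = np.zeros(4, dtype=int)    # every entry points at element 0

buffered = np.zeros(4)
buffered[indices] += x              # only one write to element 0 survives

accumulated = np.zeros(4)
np.add.at(accumulated, indices, x)  # unbuffered: each duplicate contributes

print(buffered.tolist())
print(accumulated.tolist())         # element 0 holds 0 + 1 + 2 + 3
```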
Answer 3:
As lvc explains, there is no in-place item add method, so under the hood it uses __getitem__, then __iadd__, then __setitem__. Here's a way to empirically observe that behavior:
    import numpy

    class A(numpy.ndarray):
        # Log each special method, then defer to ndarray's implementation.
        def __getitem__(self, *args, **kwargs):
            print("getitem")
            return numpy.ndarray.__getitem__(self, *args, **kwargs)

        def __setitem__(self, *args, **kwargs):
            print("setitem")
            return numpy.ndarray.__setitem__(self, *args, **kwargs)

        def __iadd__(self, *args, **kwargs):
            print("iadd")
            return numpy.ndarray.__iadd__(self, *args, **kwargs)

    # The ndarray constructor takes a shape, so this is an uninitialized
    # array of shape (1, 2, 3); a[0] is a (2, 3) sub-array of type A.
    a = A([1, 2, 3])
    print("about to increment a[0]")
    a[0] += 1
It prints
    about to increment a[0]
    getitem
    iadd
    setitem
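The same call sequence can be reproduced without NumPy at all, which supports the point that this is ordinary Python behavior. The sketch below uses two hypothetical classes, Box and Cell, invented purely for the illustration, and records the calls in a list instead of printing them.

```python
calls = []  # records the order in which the special methods run

class Cell:
    """A mutable value that supports in-place addition."""
    def __init__(self, value):
        self.value = value

    def __iadd__(self, other):
        calls.append("iadd")
        self.value += other
        return self  # in-place operators return their result

class Box:
    """A minimal container that logs item access."""
    def __init__(self, items):
        self.items = items

    def __getitem__(self, key):
        calls.append("getitem")
        return self.items[key]

    def __setitem__(self, key, value):
        calls.append("setitem")
        self.items[key] = value

box = Box([Cell(1), Cell(2)])
box[0] += 10
print(calls)  # ['getitem', 'iadd', 'setitem']
```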
Answer 4:
I think the major difference here is that in-place operators may return the same reference, but the effect on a NumPy array differs from the effect on a plain Python number.
Start with Python
    >>> a = 1
    >>> b = a
    >>> a is b
    True
These are the same reference.
    >>> a += 4
    >>> a
    5
    >>> b
    1
In-place addition on an int binds a to a new object (ints are immutable), so b keeps the old value.
Now for NumPy
    >>> import numpy as np
    >>> a = np.array([1, 2, 3], float)
    >>> b = a
    >>> a is b
    True
Again these are the same reference, but in place operators have a different effect.
    >>> a += 4
    >>> a
    array([ 5., 6., 7.])
    >>> b
    array([ 5., 6., 7.])
In-place addition on an ndarray modifies the existing object, so b sees the change too. This is not the same as a + 4 (numpy.add without an out argument), which creates a new array and rebinds a to it.
    >>> a = a + 4
    >>> a
    array([ 9., 10., 11.])
    >>> b
    array([ 5., 6., 7.])
In-place operations on borrowed references
The danger here is if the reference is passed to a different scope.
    >>> def f(x):
    ...     x += 4
    ...     return x
The argument reference x is passed into the scope of f, which does not make a copy; it changes the value at that reference in place and passes it back.
    >>> f(a)
    array([ 13., 14., 15.])
    >>> f(a)
    array([ 17., 18., 19.])
    >>> f(a)
    array([ 21., 22., 23.])
    >>> f(a)
    array([ 25., 26., 27.])
This can be confusing, so only use in-place operators on references that belong to the current scope, and be careful with borrowed references.
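One way to follow that advice is sketched below: a function that needs the sum either builds a new array or says in its name that it mutates its argument. Both function names are hypothetical and chosen only for this example.

```python
import numpy as np

def add_four_copy(x):
    """Return a new array; the caller's array is left untouched."""
    return x + 4

def add_four_inplace(x):
    """Mutate the caller's array and return that same object."""
    x += 4
    return x

a = np.array([1.0, 2.0, 3.0])

r = add_four_copy(a)
print(a.tolist(), r.tolist())  # a is unchanged, r holds the sum

s = add_four_inplace(a)
print(s is a, a.tolist())      # same object, and a has changed
```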