How do numpy's in-place operations (e.g. `+=`) work?

Anonymous (unverified), submitted 2019-12-03 02:31:01

Question:

The basic question is: What happens under the hood when doing: a[i] += b?

Given the following:

    import numpy as np
    a = np.arange(4)
    i = a > 0
    # i is array([False,  True,  True,  True], dtype=bool)

I understand that:

  • a[i] = x is the same as a.__setitem__(i, x), which assigns directly to the items indicated by i
  • a += x is the same as a.__iadd__(x), which does the addition in place

But what happens when I do:

a[i] += x 

Specifically:

  1. Is this the same as a[i] = a[i] + x? (which is not an in-place operation)
  2. Does it make a difference in this case if i is:
    • an int index, or
    • an ndarray, or
    • a slice object

Background

The reason I started delving into this is that I encountered a non-intuitive behavior when working with duplicate indices:

    a = np.zeros(4)
    x = np.arange(4)
    indices = np.zeros(4, dtype=int)  # duplicate indices (np.int was removed in NumPy 1.24)
    a[indices] += x
    # a is now array([ 3.,  0.,  0.,  0.])

More interesting stuff about duplicate indices in this question.

Answer 1:

The first thing you need to realise is that a += x doesn't map exactly to a.__iadd__(x); instead, it maps to a = a.__iadd__(x). Notice that the documentation specifically says that in-place operators return their result, and that this doesn't have to be self (although in practice it usually is). This means a[i] += x trivially maps to:

a.__setitem__(i, a.__getitem__(i).__iadd__(x)) 

So the addition technically happens in place, but only on a temporary object. There is still potentially one fewer temporary object created than if it called __add__, though.
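Whether that temporary object shares memory with a depends on what a.__getitem__(i) returns, which bears on the slice / ndarray / int part of the question. A minimal sketch, assuming standard NumPy indexing semantics (the variable names are illustrative and not from the original answer):

```python
import numpy as np

a = np.arange(4)

# Slice: basic indexing returns a view, so __iadd__ writes straight
# into a's buffer before __setitem__ is even called.
view = a[1:3]
slice_shares = np.shares_memory(view, a)

# Boolean (or integer) array: advanced indexing returns a copy, so
# __iadd__ modifies a temporary and __setitem__ writes it back into a.
mask = a > 0
temp = a[mask]
mask_shares = np.shares_memory(temp, a)

# Int index: the result is an immutable NumPy scalar, not an ndarray,
# so the addition produces a new scalar that __setitem__ then stores.
scalar = a[0]
scalar_is_array = isinstance(scalar, np.ndarray)

print(slice_shares, mask_shares, scalar_is_array)  # True False False
```

In all three cases the final value of a is what a[i] = a[i] + x would give; the difference is only in where the intermediate result lives.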



Answer 2:

Actually, that has nothing to do with NumPy. There is no "in-place set/getitem" in Python; these things are equivalent to a[indices] = a[indices] + x. Knowing that, it becomes pretty obvious what is going on. (EDIT: As lvc writes, the right-hand side is actually computed in place, so it is really a[indices] = (a[indices] += x), if that were legal syntax. That has largely the same effect, though.)

Of course, a += x on its own really is in place: it passes a as the out argument of np.add.
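The mapping to the out argument can be written out explicitly. A small sketch (the names alias and result are made up for illustration):

```python
import numpy as np

a = np.arange(4, dtype=float)
alias = a                      # second name for the same array object
x = np.ones(4)

# Roughly what a += x amounts to for an ndarray: the result is written
# into a's existing buffer instead of a newly allocated array.
result = np.add(a, x, out=a)

print(result is a)   # True: the ufunc returns the out array
print(alias)         # the alias sees the updated values
```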

This has been discussed before, and NumPy cannot do anything about it as such, though there is an idea to have an np.add.at(array, index_expression, x) to at least allow such operations.
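In current NumPy releases this is available as the ufunc method np.add.at, which performs unbuffered in-place addition so that repeated indices accumulate. A short sketch reusing the arrays from the question (the names buffered and unbuffered are illustrative):

```python
import numpy as np

x = np.arange(4)
indices = np.zeros(4, dtype=int)    # every entry points at element 0

buffered = np.zeros(4)
buffered[indices] += x              # get, add, set: the last write wins

unbuffered = np.zeros(4)
np.add.at(unbuffered, indices, x)   # accumulates 0 + 1 + 2 + 3 into element 0

print(buffered)     # element 0 is 3
print(unbuffered)   # element 0 is 6
```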



Answer 3:

As lvc explains, there is no in-place item-add method, so under the hood it uses __getitem__, then __iadd__, then __setitem__. Here's a way to observe that behavior empirically:

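    import numpy

    class A(numpy.ndarray):
        # Each override logs the call, then defers to ndarray.
        def __getitem__(self, *args, **kwargs):
            print("getitem")
            return numpy.ndarray.__getitem__(self, *args, **kwargs)
        def __setitem__(self, *args, **kwargs):
            print("setitem")
            return numpy.ndarray.__setitem__(self, *args, **kwargs)
        def __iadd__(self, *args, **kwargs):
            print("iadd")
            return numpy.ndarray.__iadd__(self, *args, **kwargs)

    # A([1, 2, 3]) would allocate an uninitialized array of shape (1, 2, 3);
    # a zero-filled array of that shape is used instead, so a[0] is a 2x3 view of type A.
    a = numpy.zeros((1, 2, 3)).view(A)
    print("about to increment a[0]")
    a[0] += 1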

It prints

    about to increment a[0]
    getitem
    iadd
    setitem


Answer 4:

I think the major difference here is that an in-place operator may or may not return the same object, and its effect on a NumPy array differs from its effect on a plain Python number.

Start with Python

    >>> a = 1
    >>> b = a
    >>> a is b
    True

Both names refer to the same object.

    >>> a += 4
    >>> a
    5
    >>> b
    1

Because ints are immutable, in-place addition binds a to a new object; b still refers to the old one.

Now for NumPy

    >>> import numpy as np
    >>> a = np.array([1, 2, 3], float)
    >>> b = a
    >>> a is b
    True

Again, both names refer to the same object, but the in-place operator has a different effect.

    >>> a += 4
    >>> a
    array([ 5.,  6.,  7.])
    >>> b
    array([ 5.,  6.,  7.])

In-place addition on an ndarray modifies the array that both names refer to. This is not the same as a + 4 (that is, numpy.add without an out argument), which builds a new array and binds a to it.

    >>> a = a + 4
    >>> a
    array([  9.,  10.,  11.])
    >>> b
    array([ 5.,  6.,  7.])
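Whether a name was rebound or the shared object was mutated can be checked directly with the is operator. A small sketch of the two NumPy cases above (the flag names are illustrative):

```python
import numpy as np

a = np.array([1, 2, 3], float)
b = a

a += 4                       # mutates the array that a and b share
same_after_iadd = a is b

a = a + 4                    # builds a new array and rebinds only a
same_after_add = a is b

print(same_after_iadd, same_after_add)  # True False
```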

In-place operations on borrowed references

The danger here is if the reference is passed to a different scope.

    >>> def f(x):
    ...     x += 4
    ...     return x

The reference x is passed into the scope of f without a copy being made, so f changes the caller's array and returns that same array.

    >>> f(a)
    array([ 13.,  14.,  15.])
    >>> f(a)
    array([ 17.,  18.,  19.])
    >>> f(a)
    array([ 21.,  22.,  23.])
    >>> f(a)
    array([ 25.,  26.,  27.])

This can be confusing, so only use in-place operators on arrays that belong to the current scope, and be careful with borrowed references.
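One way to follow that advice is to avoid the in-place operator inside the function. A minimal sketch, where f_copy is a hypothetical variant of f and not part of the original answer:

```python
import numpy as np

def f_copy(x):
    """Hypothetical variant of f that leaves the caller's array untouched."""
    result = x + 4   # allocates a new array instead of mutating x
    return result

a = np.array([1, 2, 3], float)
out = f_copy(a)

print(a)     # unchanged: values 1, 2, 3
print(out)   # new array: values 5, 6, 7
```

Calling f_copy(a) repeatedly returns the same values each time, at the cost of one extra allocation per call.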


