numpy array sliced twice

强颜欢笑 submitted on 2019-12-24 13:44:25

Question


I'm not sure I understand why this doesn't work:

import numpy as np

a = np.zeros((10, ))

# first boolean mask
pos1 = np.zeros((10, ), dtype=bool)
pos1[::2] = True

a[pos1] = 1.
print(a)
# prints [1. 0. 1. 0. 1. 0. 1. 0. 1. 0.]


# second boolean mask
pos2 = np.zeros((5, ), dtype=bool)
pos2[::2] = True

a[pos1][pos2] = 2.

print(a)
# still prints [1. 0. 1. 0. 1. 0. 1. 0. 1. 0.]

Why didn't the second indexing operation affect the full array? I thought a[pos1] was just a "view" of part of the original array... Am I missing something?

(This is just a simple example with no real use. I'm only trying to understand the behavior, because I use this kind of syntax a lot and didn't expect this result.)


Answer 1:


It's the same issue as in the recent question "Numpy doesn't change value of an array element after masking".

You are indexing with a boolean mask, so a[pos1] is a copy, not a view.

The first assignment works because it is a direct call to __setitem__ on a:

a[pos1] = 1.
a.__setitem__(pos1, 1.)

The second does not, because the assignment is applied to a[pos1], which is a temporary copy:

a[pos1][pos2] = 2.
a.__getitem__(pos1).__setitem__(pos2, 2.)

a[::2][pos2] = 3 does work because a[::2] is a basic slice, which returns a view, even though it selects the same values as a[pos1].
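
If the goal is to assign through both masks at once, one option is to fold them into a single index, so that only one __setitem__ call is made and it is made on a itself. This is a minimal sketch rather than the only approach; it reuses a, pos1 and pos2 from the question, and the name idx is introduced here purely for illustration.

```python
import numpy as np

a = np.zeros(10)
pos1 = np.zeros(10, dtype=bool)
pos1[::2] = True
pos2 = np.zeros(5, dtype=bool)
pos2[::2] = True

a[pos1] = 1.0

# Integer positions in `a` selected by pos1, then narrowed by pos2.
idx = np.flatnonzero(pos1)[pos2]

# One indexed assignment directly on `a`, so `a` is modified in place.
a[idx] = 2.0
print(a)
```

Here np.flatnonzero converts the first mask into integer positions, and the second mask then picks a subset of those positions.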

One way to check whether something is a copy or a view is to look at the array's data pointer:

a.__array_interface__['data']
a[pos1].__array_interface__['data'] # will be different
a[::2].__array_interface__['data']  # should be the same
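
As an alternative to comparing raw data pointers, NumPy's np.shares_memory and the .base attribute can answer the same question. The sketch below assumes the a and pos1 from the question; the names masked and sliced are introduced only for this example.

```python
import numpy as np

a = np.zeros(10)
pos1 = np.zeros(10, dtype=bool)
pos1[::2] = True

masked = a[pos1]   # boolean-mask (advanced) indexing
sliced = a[::2]    # basic slicing

# The masked result has its own buffer; the slice reuses the buffer of `a`.
print(np.shares_memory(a, masked))
print(np.shares_memory(a, sliced))

# A view keeps a reference to the array that owns its memory.
print(sliced.base is a)
```

Unlike the pointer comparison, np.shares_memory also detects views that start at a non-zero offset, such as a[1::2].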



Answer 2:


Take a look at the Python bytecode (the output of dis) for the following three functions:

In [187]: def b():
    a[pos1][pos2]=2
    return a

In [188]: dis.dis(b)
  2           0 LOAD_CONST               1 (2)
              3 LOAD_GLOBAL              0 (a)
              6 LOAD_GLOBAL              1 (pos1)
              9 BINARY_SUBSCR       
             10 LOAD_GLOBAL              2 (pos2)
             13 STORE_SUBSCR        

  3          14 LOAD_GLOBAL              0 (a)
             17 RETURN_VALUE        

In [189]: b()
Out[189]: array([ 1.,  0.,  1.,  0.,  1.,  0.,  1.,  0.,  1.,  0.])




In [190]: def c():
    e=a.copy()
    e[pos1][pos2]=2
    return e

In [191]: dis.dis(c)
  2           0 LOAD_GLOBAL              0 (a)
              3 LOAD_ATTR                1 (copy)
              6 CALL_FUNCTION            0
              9 STORE_FAST               0 (e)

  3          12 LOAD_CONST               1 (2)
             15 LOAD_FAST                0 (e)
             18 LOAD_GLOBAL              2 (pos1)
             21 BINARY_SUBSCR       
             22 LOAD_GLOBAL              3 (pos2)
             25 STORE_SUBSCR        

  4          26 LOAD_FAST                0 (e)
             29 RETURN_VALUE 

In [191]: c()
Out[191]: array([ 1.,  0.,  1.,  0.,  1.,  0.,  1.,  0.,  1.,  0.])




In [192]: def d():
    f=a[pos1]
    f[pos2]=2
    return f

In [193]: dis.dis(d)
  2           0 LOAD_GLOBAL              0 (a)
              3 LOAD_GLOBAL              1 (pos1)
              6 BINARY_SUBSCR       
              7 STORE_FAST               0 (f)

  3          10 LOAD_CONST               1 (2)
             13 LOAD_FAST                0 (f)
             16 LOAD_GLOBAL              2 (pos2)
             19 STORE_SUBSCR        

  4          20 LOAD_FAST                0 (f)
             23 RETURN_VALUE  

In [194]: d()
Out[194]: array([ 2.,  1.,  2.,  1.,  2.])

In the disassembly of b and c, BINARY_SUBSCR evaluates a[pos1] (or e[pos1]) and leaves the result, a new array holding a copy of the selected elements, on top of the stack. STORE_SUBSCR then writes 2 into that temporary array, which is never bound to a name and is discarded. The function returns the global a (case 1) or the local e (case 2), neither of which was modified. When the operations are split (case 3), STORE_FAST binds the copy to f, so the later STORE_SUBSCR modifies f and the function returns the modified copy; a itself is still unchanged.
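
The third case can also be turned into a working update by writing the modified copy back explicitly. This is a sketch under the same setup as the question; the name before is introduced here only to show that a is untouched until the write-back.

```python
import numpy as np

a = np.zeros(10)
pos1 = np.zeros(10, dtype=bool)
pos1[::2] = True
pos2 = np.zeros(5, dtype=bool)
pos2[::2] = True
a[pos1] = 1.0

f = a[pos1]        # copy of the selected elements, now bound to a name
f[pos2] = 2.0      # modifies the copy only
before = a.copy()  # snapshot: `a` has not changed at this point

a[pos1] = f        # explicit write-back through a single __setitem__ on `a`
print(before)
print(a)
```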



Source: https://stackoverflow.com/questions/32063014/numpy-array-sliced-twice
