I'm confused about the results of numpy reshape operated on a view. In the following, q.flags shows that it does not own the data, but q.base is neither x nor y, so what is
I like to use .__array_interface__ to inspect cases like this.
In [811]: x.__array_interface__
Out[811]:
{'data': (149194496, False),
'descr': [('', '<f8')],
'shape': (4, 4),
'strides': None,
'typestr': '<f8',
'version': 3}
In [813]: y.__array_interface__
Out[813]:
{'data': (149194496, False),
'descr': [('', '<f8')],
'shape': (4, 4),
'strides': (8, 32),
'typestr': '<f8',
'version': 3}
In [814]: x.strides
Out[814]: (32, 8)
In [815]: y.strides
Out[815]: (8, 32)
The transpose was performed by reversing the strides; the data pointer is unchanged.
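The same check can be scripted rather than read off the dictionaries by eye. This is a minimal sketch, assuming a fresh 4x4 float64 array like the one in the session above; the variable names are only illustrative.

```python
import numpy as np

x = np.random.rand(4, 4)   # C-contiguous float64 array that owns its buffer
y = x.T                    # transpose: no data is moved

# The first entry of the 'data' tuple is the address of the buffer's first byte.
x_ptr = x.__array_interface__['data'][0]
y_ptr = y.__array_interface__['data'][0]

same_buffer = (x_ptr == y_ptr)
strides_reversed = (y.strides == x.strides[::-1])

print(x.strides, y.strides)
print(same_buffer, strides_reversed)
```

Comparing pointers this way only detects arrays that start at the same address; np.shares_memory is the more general test for overlapping buffers.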
In [817]: q.__array_interface__
Out[817]:
{'data': (165219304, False),
'descr': [('', '<f8')],
'shape': (16,),
'strides': None,
'typestr': '<f8',
'version': 3}
So the q data is a copy (different pointer). Its strides, (8,) as reported by q.strides (the interface shows None because q is C-contiguous), mean its elements are accessed by stepping from one f8 to the next. By contrast, x.reshape(16) is a view of x, because its data can be accessed with a simple 8-byte step.
To access the original data in the q order, it would have to step 32 bytes 3 times (down x rows), then go back to the start and step 8 to the 2nd x column, followed by 3 row steps, etc. Since striding doesn't work this way, it has to work from a copy.
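To make the stepping argument concrete, here is a rough sketch that lists the byte offsets a flattened y would have to visit in x's buffer. It assumes the same 4x4 float64 setup; offsets, steps, flat_x and flat_y are names introduced only for this illustration.

```python
import numpy as np

x = np.random.rand(4, 4)
y = x.T

# Byte offset of y[i, j] inside the shared buffer, listed in the
# row-major order that a flattened y would need.
offsets = [i * y.strides[0] + j * y.strides[1]
           for i in range(y.shape[0])
           for j in range(y.shape[1])]

# A 1-d view needs one constant step; collect the distinct steps actually required.
steps = sorted(set(np.diff(offsets).tolist()))

flat_x = x.reshape(16)   # a single 8-byte stride works, so this can be a view
flat_y = y.reshape(16)   # no single stride works, so numpy has to copy

print(offsets[:6])
print(steps)
print(np.shares_memory(x, flat_x), np.shares_memory(x, flat_y))
```

More than one distinct step in steps is exactly the situation a single stride cannot describe.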
Note also that y[0,0] changes x[0,0], but q[0] is independent of both.
While OWNDATA for q is false, it is True for y.ravel() and y.flatten(). I suspect reshape() in this case is making a copy, and then reshaping, and it's the intermediate copy that 'owns' the data, q.base.
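A quick way to compare the three flattening routes side by side is sketched below. It assumes the same transposed 4x4 array; the flag values are printed rather than asserted in comments, since the intermediate-copy behaviour of reshape is an implementation detail.

```python
import numpy as np

x = np.random.rand(4, 4)
y = x.T

q = y.reshape(16)   # copy made internally, result is a view of that copy
r = y.ravel()       # copies here because y is not C-contiguous
f = y.flatten()     # always returns a new array

for name, arr in [('reshape', q), ('ravel', r), ('flatten', f)]:
    print(name,
          'OWNDATA =', arr.flags['OWNDATA'],
          '| base is None =', arr.base is None,
          '| shares memory with x =', np.shares_memory(x, arr))
```

None of the three results shares memory with x, whatever their OWNDATA flags say.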
In short: you cannot always rely on ndarray.flags['OWNDATA'] alone to tell a view from a copy.
>>> import numpy as np
>>> x = np.random.rand(2,2)
>>> y = x.T
>>> q = y.reshape(4)
>>> y[0,0]
0.86751629121019136
>>> y[0,0] = 1
>>> q
array([ 0.86751629, 0.87671107, 0.65239976, 0.41761267])
>>> x
array([[ 1. , 0.65239976],
[ 0.87671107, 0.41761267]])
>>> y
array([[ 1. , 0.87671107],
[ 0.65239976, 0.41761267]])
>>> y.flags['OWNDATA']
False
>>> x.flags['OWNDATA']
True
>>> q.flags['OWNDATA']
False
>>> np.may_share_memory(x,y)
True
>>> np.may_share_memory(x,q)
False
Because q, unlike x and y, did not reflect the change to the first element, its data must be owned somewhere other than x, even though its OWNDATA flag is False (how this happens is explained below).
There is more discussion about the OWNDATA flag over at the numpy-discussion mailing list. In the question "How can I tell if NumPy creates a view or a copy?", it is briefly mentioned that simply checking the flags.owndata of an ndarray sometimes seems to fail and appears unreliable, as you mention. That's because every ndarray also has a base attribute:
the base of an ndarray is a reference to another array if the memory originated elsewhere (otherwise, the base is None).
The operation y.reshape(4) creates a copy, not a view, because the strides of y are (8, 16). To reshape it (C-contiguous) to (4,), the memory pointer would have to jump 0->16->8->24, which cannot be done with a single stride.
Thus q.base refers to the array produced by the forced copy inside y.reshape(4). That copy has the same shape as y but holds copied elements, and so has normal strides again: (16, 8). No other name is bound to it, since it only exists as an intermediate result of y.reshape(4). Only this copy can be viewed in a (4,) shape, because its strides allow it. q is then indeed a view on q.base.
For most people it would be confusing to see that q.flags.owndata is False, because, as shown above, q is not a view on y. It is, however, a view on a copy of y, and that copy, q.base, is the owner of the data. Thus the flags are actually correct, if you inspect closely.
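One way to act on this is to follow the base chain until it ends and compare owners instead of flags. The helper below, root_owner, is a hypothetical name introduced for illustration and is not part of numpy; it is a sketch assuming every link in the chain is an ndarray.

```python
import numpy as np

def root_owner(arr):
    """Follow .base links and return the last ndarray in the chain (hypothetical helper)."""
    while isinstance(arr.base, np.ndarray):
        arr = arr.base
    return arr

x = np.array([[1.0, 2.0], [3.0, 4.0]])   # owns its data
y = x.T                                   # view of x
q = y.reshape(4)                          # built from a copy, not from x

print(root_owner(y) is x)                 # y traces back to x
print(root_owner(q) is x)                 # q traces back to something else
print(root_owner(q).flags['OWNDATA'])
print(root_owner(q).strides)
```

When only the question "do these two arrays overlap in memory?" matters, np.shares_memory(a, b) answers it directly without inspecting base or flags at all.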