When subclassing ndarray why does a transpose happen after __array_finalize__ and not before?

Question

Let us for simplicity just copy the diagnostic ndarray subclass from the numpy docs:

import numpy as np

class MySubClass(np.ndarray):

    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        print('In __array_finalize__:')
        print('   self is %s' % repr(self))
        print('   obj is %s' % repr(obj))
        if obj is None: return
        self.info = getattr(obj, 'info', None)

Now let's do a simple example:

>>> x = MySubClass(np.ones((1,5)))
In __array_finalize__:
   self is MySubClass([[1., 1., 1., 1., 1.]])
   obj is array([[1., 1., 1., 1., 1.]])
>>> y = x.T
In __array_finalize__:
   self is MySubClass([[1., 1., 1., 1., 1.]])
   obj is MySubClass([[1., 1., 1., 1., 1.]])

As we can see, the array passed to __array_finalize__ clearly isn't the transpose: it still has shape (1, 5). Apart from stretching the meaning of the word "finalize" to entirely new realms, what is the purpose of this behavior?

Wouldn't it make more sense to send the actual output, i.e. the transpose, through this hook so that it can be finalized?

What is the recommended way to touch up the base transpose with whatever postprocessing my subclass might require?

Answer 1:

That's because, to create the new object, NumPy relies on the existing general-purpose function PyArray_NewFromDescrAndBase to allocate and set up the array. The source code of PyArray_Transpose shows that the new object is first created from the existing array, using the original (non-transposed) dimensions, and only afterwards are its dimensions and strides corrected in place, in the memory that was just allocated for them:

/*
 * this allocates memory for dimensions and strides (but fills them
 * incorrectly), sets up descr, and points data at PyArray_DATA(ap).
 */
Py_INCREF(PyArray_DESCR(ap));
ret = (PyArrayObject *) PyArray_NewFromDescrAndBase(
        Py_TYPE(ap), PyArray_DESCR(ap),
        n, PyArray_DIMS(ap), NULL, PyArray_DATA(ap),
        flags, (PyObject *)ap, (PyObject *)ap);
if (ret == NULL) {
    return NULL;
}

/* fix the dimensions and strides of the return-array */
for (i = 0; i < n; i++) {
    PyArray_DIMS(ret)[i] = PyArray_DIMS(ap)[permutation[i]];
    PyArray_STRIDES(ret)[i] = PyArray_STRIDES(ap)[permutation[i]];
}
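The effect of that fix-up loop can be checked from Python. This is only a small sketch: the array `a` and the axis order in `permutation` are arbitrary choices made for illustration, and the code mirrors the bookkeeping of the loop rather than the C implementation itself.

```python
import numpy as np

a = np.arange(24, dtype=np.int64).reshape(2, 3, 4)
permutation = (2, 0, 1)  # arbitrary axis order, chosen for illustration

# Same bookkeeping as the C loop: pick dims and strides by permuted index.
expected_shape = tuple(a.shape[p] for p in permutation)
expected_strides = tuple(a.strides[p] for p in permutation)

t = a.transpose(permutation)

print(t.shape == expected_shape)      # shape is the permuted shape
print(t.strides == expected_strides)  # strides are the permuted strides
print(np.shares_memory(a, t))         # no data is copied, only metadata changes
```

Because only the dimension and stride metadata change, the transpose is a view on the same data buffer as the original array.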

In the C code above, PyArray_NewFromDescrAndBase is the function that invokes __array_finalize__, so the hook receives the array while it still has the incorrect (i.e. non-transposed) shape and strides. It could be done differently, but that would require an extra parameter on PyArray_NewFromDescrAndBase to defer the call to __array_finalize__, so that the hook could be invoked manually once the shape and strides have been adjusted.
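As for touching up the transposed result, one possible workaround (a sketch, not an officially recommended recipe) is to override transpose and the T property in the subclass and do the postprocessing there, after the base implementation has returned. The names PostTransposeArray, _after_transpose, shape_seen_in_finalize and final_shape are hypothetical and exist only for this illustration.

```python
import numpy as np

class PostTransposeArray(np.ndarray):
    """Hypothetical subclass that post-processes the result of a transpose."""

    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.info = getattr(obj, 'info', None)
        # Shape as seen by the hook; for a transpose this may still be
        # the shape of the source array.
        self.shape_seen_in_finalize = self.shape

    def _after_transpose(self, result):
        # Hypothetical postprocessing step: by the time this runs, the
        # base class has finished adjusting shape and strides.
        result.final_shape = result.shape
        return result

    def transpose(self, *axes):
        return self._after_transpose(super().transpose(*axes))

    @property
    def T(self):
        # ndarray.T does not go through the Python-level transpose method,
        # so it is overridden separately here.
        return self.transpose()


x = PostTransposeArray(np.ones((1, 5)), info='meta')
y = x.T
print(y.shape_seen_in_finalize, y.final_shape)
```

This only covers .T and .transpose() (and np.transpose, which dispatches to the method). Other operations that produce views would still need their own handling.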

Source: https://stackoverflow.com/questions/60714489/when-subclassing-ndarray-why-does-a-transpose-happen-after-array-finalize-an

Tags

python

numpy

subclassing