Python numpy recarray: Can one obtain a view into different fields using pointer arithmetic?

Submitted by 你。 on 2019-12-24 04:03:45

Question


I have a numpy structured array of the following form:

import numpy as np

x = np.array([(1,2,3)]*2, [('t', np.int16), ('x', np.int8), ('y', np.int8)])

I now want to generate views into this array that team up 't' with either 'x' or 'y'. The usual syntax creates a copy:

v_copy = x[['t', 'y']]
v_copy
#array([(1, 3), (1, 3)], 
#     dtype=[('t', '<i2'), ('y', '|i1')])

v_copy.base is None
#True

This is not unexpected, since picking two fields is "fancy indexing", at which point numpy gives up and makes a copy. Since my actual records are large, I want to avoid the copy at all costs.
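A note on versions: the copy described above reflects the NumPy release in use when this question was asked. To the best of my knowledge, from NumPy 1.16 onward indexing with a list of field names returns a view whose dtype keeps the original offsets and itemsize. The following is a minimal check under that assumption, not a statement about every release; the name is_view is only illustrative.

```python
import numpy as np

x = np.array([(1, 2, 3)] * 2,
             dtype=[('t', np.int16), ('x', np.int8), ('y', np.int8)])

# Assumption: NumPy >= 1.16, where multi-field indexing yields a view.
v = x[['t', 'y']]

# Modify the original *after* indexing; a view will see the change,
# a copy will not.
x['t'][0] = 9
is_view = bool(v['t'][0] == 9)

print(np.__version__, "multi-field index is a view:", is_view)
```

On older releases the same script should report False, in which case the techniques discussed in the answers still apply.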

However, the required elements can certainly be addressed within numpy's strided memory model. Looking at the individual bytes in memory:

x.view(np.int8)
#array([1, 0, 2, 3, 1, 0, 2, 3], dtype=int8)

one can figure out the necessary strides:

v = np.recarray((2,2), [('b', np.int8)], buf=x, strides=(4,3))
v
#rec.array([[(1,), (3,)],
#    [(1,), (3,)]], 
#    dtype=[('b', '|i1')])
v.base is x
#True
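The strides (4, 3) used above can also be derived from the dtype instead of being read off the byte dump. This is a small sketch; the variable names offsets, record_stride and field_stride are illustrative and not part of the original question.

```python
import numpy as np

x = np.array([(1, 2, 3)] * 2,
             dtype=[('t', np.int16), ('x', np.int8), ('y', np.int8)])

# dtype.fields maps each field name to (dtype, byte offset in the record).
offsets = {name: x.dtype.fields[name][1] for name in x.dtype.names}

# Step from one record to the next, in bytes.
record_stride = x.strides[0]

# Step from the first byte of 't' to the byte holding 'y', within a record.
field_stride = offsets['y'] - offsets['t']

print(offsets, record_stride, field_stride)
```

Note that the int8 view at offset 0 only sees the low byte of the int16 field 't', which is why a reinterpretation step is still needed.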

Clearly, v points to the correct locations in memory without having created a copy. Unfortunately, numpy won't allow me to reinterpret these memory locations as the original data types:

v_view = v.view([('t', np.int16), ('y', np.int8)])
#ValueError: new type not compatible with array.

Is there a way to trick numpy into doing this cast, so that an array v_view equivalent to v_copy is created, but without having made a copy? Perhaps working directly on v.__array_interface__, as is done in np.lib.stride_tricks.as_strided()?


Answer 1:


You can construct a suitable dtype like so

dt2 = np.dtype(dict(names=('t', 'x'), formats=(np.int16, np.int8), offsets=(0, 2)))

and then do

y = np.recarray(x.shape, buf=x, strides=x.strides, dtype=dt2)

In NumPy versions newer than 1.6, where the dtype can carry an explicit itemsize, you can also view the array directly:

dt2 = np.dtype(dict(names=('t', 'x'), formats=(np.int16, np.int8), offsets=(0, 2), itemsize=4))
y = x.view(dt2)
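The difference between the two forms comes down to the itemsize of the new dtype. The sketch below, which assumes a NumPy release that accepts the itemsize key, shows that the dtype without it is only 3 bytes wide (hence the explicit strides in the first form), while the padded one matches the 4-byte records of x and can be used with view(). The names dt_packed, dt_padded and tx are illustrative.

```python
import numpy as np

x = np.array([(1, 2, 3)] * 2,
             dtype=[('t', np.int16), ('x', np.int8), ('y', np.int8)])

# Without 'itemsize' the dtype ends right after the last field: 3 bytes.
dt_packed = np.dtype({'names': ['t', 'x'],
                      'formats': [np.int16, np.int8],
                      'offsets': [0, 2]})

# With 'itemsize' set to the record width, the dtype covers all 4 bytes.
dt_padded = np.dtype({'names': ['t', 'x'],
                      'formats': [np.int16, np.int8],
                      'offsets': [0, 2],
                      'itemsize': x.dtype.itemsize})

tx = x.view(dt_padded)   # same bytes, relabelled; no data is copied

tx['x'][1] = 50          # a write through the view changes x itself
print(dt_packed.itemsize, dt_padded.itemsize, x)
```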



Answer 2:


This works with numpy 1.6.x and avoids creating a recarray:

dt2 = {'t': (np.int16, 0), 'y': (np.int8, 3)}
v_view = np.ndarray(x.shape, dtype=dt2, buffer=x, strides=x.strides)
v_view
#array([(1, 3), (1, 3)], 
#    dtype=[('t', '<i2'), ('', '|V1'), ('y', '|i1')])
v_view.base is x
#True

One can wrap this in a class subclassing np.ndarray:

class arrayview(np.ndarray):
    def __new__(subtype, x, fields):
        dtype = {f: x.dtype.fields[f] for f in fields}
        return np.ndarray.__new__(subtype, x.shape, dtype,
                                  buffer=x, strides=x.strides)

v_view = arrayview(x, ('t', 'y'))
v_view
#arrayview([(1, 3), (1, 3)], 
#    dtype=[('t', '<i2'), ('', '|V1'), ('y', '|i1')])
v_view.base is x
#True
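As an alternative to subclassing, the field lookup from arrayview can be combined with the itemsize-preserving dtype from Answer 1 in a plain function. This is only a sketch: field_view is a hypothetical helper, not a NumPy function, and it assumes a NumPy release that accepts the itemsize key. Because it goes through view() rather than the buffer argument, it should also work on a strided slice of the original array.

```python
import numpy as np

def field_view(arr, names):
    """Hypothetical helper: zero-copy view of `arr` restricted to `names`."""
    fields = arr.dtype.fields
    dt = np.dtype({
        'names': list(names),
        'formats': [fields[n][0] for n in names],   # field dtypes
        'offsets': [fields[n][1] for n in names],   # original byte offsets
        'itemsize': arr.dtype.itemsize,             # keep full record width
    })
    return arr.view(dt)

x = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12)],
             dtype=[('t', np.int16), ('x', np.int8), ('y', np.int8)])

# View records 0 and 2 only, exposing just 't' and 'y'.
v = field_view(x[::2], ('t', 'y'))

# Writing through the view modifies the matching records of x.
v['y'] = 0
print(v['t'], x['y'])
```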


Source: https://stackoverflow.com/questions/11774168/python-numpy-recarray-can-one-obtain-a-view-into-different-fields-using-pointer
