How to keep numpy from broadcasting when creating an object array of different shaped arrays

被刻印的时光 ゝ submitted on 2019-12-17 20:23:44

Question


I'm trying to store a list of differently shaped arrays as a dtype=object array using np.save (I'm aware I could just pickle the list, but I'm really curious how to do this). If I do this:

import numpy as np
np.save('test.npy', [np.zeros((2, 2)), np.zeros((3,3))])

it works. But this:

np.save('test.npy', [np.zeros((2, 2)), np.zeros((2,3))])

Gives me an error:

ValueError: could not broadcast input array from shape (2,2) into shape (2)

I guess np.save converts the list into an array first, so I tried:

x=np.array([np.zeros((2, 2)), np.zeros((3,3))])
y=np.array([np.zeros((2, 2)), np.zeros((2,3))])

This has the same effect (the first one works, the second one doesn't). The resulting x behaves as expected:

>>> x.shape
(2,)
>>> x.dtype
dtype('O')
>>> x[0].shape
(2, 2)
>>> x[0].dtype
dtype('float64')

I also tried to force the 'object' dtype:

np.array([np.zeros((2, 2)), np.zeros((2,3))], dtype=object)

Without success. It seems numpy tries to broadcast the arrays with equal first dimension into the new array and realizes too late that their shapes are different. Oddly, it seems to have worked at one point, so I'm really curious what the difference is, and how to do this properly.


EDIT: I figured out the case it worked before: The only difference seems to be that the numpy arrays in the list have another data type. It works with dtype('<f8'), but it doesn't with dtype('float64'), I'm not even sure what the difference is.


EDIT 2: I found a very non-pythonic way to solve my issue. I'm adding it here in case it helps to understand what I wanted to do:

# Keep this a plain list: wrapping it in np.array(...) raises the same ValueError.
array_list = [np.zeros((2, 2)), np.zeros((2, 3))]
save_array = np.empty((len(array_list),), dtype=object)
for idx, arr in enumerate(array_list):
    save_array[idx] = arr
np.save('test.npy', save_array)

Answer 1:


One of the first things that np.save does is

arr = np.asanyarray(arr)

So yes it is trying to turn your list into an array.

Constructing an object array from arbitrary sized arrays or lists is tricky. np.array(...) tries to create as high a dimensional array as it can, even attempting to concatenate the inputs if possible. The surest way is to do what you did - make the empty array and fill it.

A slightly more compact way of constructing the object array:

In [21]: alist = [np.zeros((2, 2)), np.zeros((2,3))]
In [22]: arr = np.empty(len(alist), dtype=object)
In [23]: arr[:] = alist
In [24]: arr
Out[24]: 
array([array([[ 0.,  0.],
       [ 0.,  0.]]),
       array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])], dtype=object)
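
To round-trip such an array through np.save, something along these lines should work. This is a minimal sketch, assuming a NumPy version where np.load defaults to allow_pickle=False, so pickling has to be enabled explicitly when reading an object array back. The temporary file path is only illustrative, and the array is filled element by element rather than by slice assignment to stay on the most conservative path.

```python
import os
import tempfile

import numpy as np

alist = [np.zeros((2, 2)), np.zeros((2, 3))]

# Create-and-fill: one object slot per input array.
arr = np.empty(len(alist), dtype=object)
for i, a in enumerate(alist):
    arr[i] = a

# Illustrative path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "test.npy")
np.save(path, arr)

# Object arrays are stored via pickle, so loading must allow it.
loaded = np.load(path, allow_pickle=True)
loaded_shapes = [a.shape for a in loaded]
```

Only enable allow_pickle=True for files you trust, since unpickling can execute arbitrary code.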

Here are 3 scenarios:

Arrays that match in shape, combine into a 3d array:

In [27]: np.array([np.zeros((2, 2)), np.zeros((2,2))])
Out[27]: 
array([[[ 0.,  0.],
        [ 0.,  0.]],

       [[ 0.,  0.],
        [ 0.,  0.]]])
In [28]: _.shape
Out[28]: (2, 2, 2)

Arrays that don't match on the first dimension create an object array:

In [29]: np.array([np.zeros((2, 2)), np.zeros((3,2))])
Out[29]: 
array([array([[ 0.,  0.],
       [ 0.,  0.]]),
       array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])], dtype=object)
In [30]: _.shape
Out[30]: (2,)

An awkward intermediate case (which may even be described as a bug): the first dimensions match, but the second ones don't:

In [31]: np.array([np.zeros((2, 2)), np.zeros((2,3))])
...
ValueError: could not broadcast input array from shape (2,2) into shape (2)

It's as though it initialized a (2,2,2) array, and then found that the (2,3) array wouldn't fit. The current logic doesn't allow it to back up and create the object array as it did in the previous scenario.

If you wanted to put the two (2,2) arrays into an object array, you'd have to use the create-and-fill logic.
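
The create-and-fill logic can be wrapped in a small helper. This is a minimal sketch: the name as_object_array is hypothetical (it is not a NumPy function), and it fills the slots one at a time so that NumPy never gets a chance to combine the inputs, even when their shapes match.

```python
import numpy as np


def as_object_array(arrays):
    """Hypothetical helper: return a 1-D object array with one slot per input."""
    out = np.empty(len(arrays), dtype=object)
    for i, a in enumerate(arrays):
        # Integer indexing stores the array itself as a single object.
        out[i] = a
    return out


# Matching shapes: stays a length-2 object array instead of a (2, 2, 2) array.
same = as_object_array([np.zeros((2, 2)), np.zeros((2, 2))])

# Matching first dimension only: the case that raises with np.array(...).
mixed = as_object_array([np.zeros((2, 2)), np.zeros((2, 3))])
```

Either result can then be passed to np.save, since np.asanyarray leaves an existing object array unchanged.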



Source: https://stackoverflow.com/questions/43173540/how-to-keep-numpy-from-broadcasting-when-creating-an-object-array-of-different-s
