h5py returning unexpected results in indexing

Posted by 旧时模样 on 2019-12-23 05:12:33

Question


I'm trying to fill an h5py dataset with a series of NumPy arrays that I generate one at a time, so that everything fits in memory.

The dataset is initialised so that its first dimension can grow to any size:

f.create_dataset('x-data', (1, maxlen, 50), maxshape=(None, maxlen, 50))

After generating each NumPy array X, I write it with:

f['x-data'][alen:alen + len(data),:,:] = X

For the first array, for example, alen=0 and len(data)=10056. I then increment alen so that the next array starts where the last one ended. Printing the shape of the slice I just wrote to gives an unexpected result:

print f['x-data'][alen:alen + len(data),:,:].shape, alen, len(data)

(1L, 60L, 50L) 0 10056

Does anyone know why the 0:10056 indexing is being interpreted as 1L?


Answer 1:


I replicated your example, but on a much smaller scale. I had to do a resize each time I added elements, e.g.

f['xdata'].resize(50,axis=0)

The first time I tried to add a block I got an error:

TypeError: Can't broadcast (10, 20, 10) -> (1, 20, 10)

But subsequent times, when I'd outgrown the allocated space, it failed silently. No error, it just didn't end up storing the new values.

This was with h5py version 2.2.1.
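The shape reported in the question can be reproduced with a read alone. The following is a minimal sketch, not the original poster's code: the file name `clip_sketch.h5` and the in-memory `core` driver are assumptions chosen so that nothing is written to disk. It only reads, because how writes past the current extent behave may differ between h5py versions.

```python
import h5py

# In-memory file (assumed setup for illustration; nothing is saved to disk).
with h5py.File("clip_sketch.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("xdata", shape=(1, 20, 10), maxshape=(None, 20, 10))

    # The slice asks for 10 rows, but the dataset currently holds only 1,
    # so the selection is clipped to the current extent.
    shape_before = dset[0:10, :, :].shape

    # After growing the first axis, the same slice covers all 10 rows.
    dset.resize(10, axis=0)
    shape_after = dset[0:10, :, :].shape

print(shape_before, shape_after)
```

The first shape has a leading dimension of 1, which matches the `(1L, 60L, 50L)` seen in the question: the slice is limited to the rows that currently exist.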




Answer 2:


I found the answer from a helpful person on the user group.

Setting an entry of maxshape to None does not make the dataset resize automatically; it must be resized explicitly each time new data is added. So the first dimension must be increased before writing the new data:

    x.resize((x.shape[0] + X.shape[0], X.shape[1], X.shape[2]))
    y.resize((y.shape[0] + Y.shape[0], Y.shape[1], Y.shape[2]))

The dataset then adds the values correctly.
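Putting the pieces together, the resize-then-write pattern can be sketched as follows. The helper name `append_rows`, the file name, the block sizes, and the choice to start from an empty first dimension are all assumptions made for illustration; they do not come from the original post.

```python
import numpy as np
import h5py

def append_rows(dset, block):
    """Grow dset along axis 0, then write block into the new rows (hypothetical helper)."""
    start = dset.shape[0]
    dset.resize(start + block.shape[0], axis=0)
    dset[start:start + block.shape[0], ...] = block
    return dset.shape[0]

maxlen = 60  # assumed value, taken from the shape printed in the question

# In-memory file (assumed setup for illustration; nothing is saved to disk).
with h5py.File("append_sketch.h5", "w", driver="core", backing_store=False) as f:
    # Start with zero rows; maxshape=None on axis 0 only *permits* later growth.
    dset = f.create_dataset("x-data", shape=(0, maxlen, 50),
                            maxshape=(None, maxlen, 50), dtype="f8")
    for n in (3, 5):  # two small stand-in blocks instead of 10056 rows
        X = np.ones((n, maxlen, 50))
        append_rows(dset, X)

    final_shape = dset.shape
    total = float(dset[...].sum())

print(final_shape, total)
```

Resizing only along axis 0 with `resize(size, axis=0)` avoids restating the other two dimensions, which is equivalent to the full-shape `resize` call shown above when those dimensions do not change.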



Source: https://stackoverflow.com/questions/32258222/h5py-returning-unexpected-results-in-indexing
