Resizing and storing dataset in .h5 format using h5py in python

逝去的感伤 2020-12-21 11:32

I am trying to resize a dataset and store new values using the h5py package in Python. My dataset size keeps increasing at every time instance, and I would like to ap

2 Answers
  •  失恋的感觉
    2020-12-21 12:00

    @tel provided an elegant solution to the problem. I outlined a simpler approach in my comments below his answer. It is simpler for a beginner to code (and understand). Basically, there are a few minor changes to @Maxtron's original code. The modifications are:

    • move with h5py.File(path, "a") as hf: to the __main__ routine, so the file stays open while the datasets are being resized and written
    • pass hf to create_h5py(hf)
    • I also added a test before os.remove() to avoid errors if the h5 file doesn't exist

    My suggested modifications are below:

    import h5py, os
    import numpy as np
    
    path = './out.h5'
    # test existence of H5 file before deleting
    if os.path.isfile(path):
        os.remove(path)
    
    def create_h5py(hf):
        grp = hf.create_group('left')
        dset = []
        dset.append(grp.create_dataset('voltage', (10**4,3), maxshape=(None,3), dtype='f', chunks=(10**4,3)))
        dset.append(grp.create_dataset('current', (10**4,3), maxshape=(None,3), dtype='f', chunks=(10**4,3)))
        return dset
    
    if __name__ == '__main__':
    
        with h5py.File(path, "a") as hf:
            dset = create_h5py(hf)
            for i in range(3):
                if i == 0:
                    # first pass: fill the initial 10**4 rows
                    dset[0][:] = np.random.random(dset[0].shape)
                    dset[1][:] = np.random.random(dset[1].shape)
                else:
                    # later passes: grow axis 0 by 10**4 rows, then write into the new rows
                    dset[0].resize(dset[0].shape[0]+10**4, axis=0)
                    dset[0][-10**4:] = np.random.random((10**4,3))
                    dset[1].resize(dset[1].shape[0]+10**4, axis=0)
                    dset[1][-10**4:] = np.random.random((10**4,3))
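
The resize-then-write pair in the loop can also be wrapped in a small helper so that each append is a single call. The following is only a sketch of that idea, not part of the original code: `append_rows`, `BLOCK`, and `demo_path` are names made up for illustration, and it writes to a temporary file instead of `./out.h5`.

```python
import os
import tempfile

import h5py
import numpy as np

BLOCK = 10**4  # rows per time instance, same block size as the code above


def append_rows(dset, rows):
    """Hypothetical helper: grow `dset` along axis 0 and write `rows` there."""
    rows = np.asarray(rows)
    old_len = dset.shape[0]
    # resize() only works because the dataset was created with maxshape=(None, 3)
    dset.resize(old_len + rows.shape[0], axis=0)
    dset[old_len:] = rows


# Throwaway location, so the sketch does not touch any real data.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")

with h5py.File(demo_path, "w") as hf:
    grp = hf.create_group("left")
    voltage = grp.create_dataset("voltage", (BLOCK, 3), maxshape=(None, 3),
                                 dtype="f", chunks=(BLOCK, 3))
    # The first block fills the rows allocated at creation time.
    block = np.random.random((BLOCK, 3))
    voltage[:] = block
    # Every later block is appended through the helper.
    for _ in range(2):
        block = np.random.random((BLOCK, 3))
        append_rows(voltage, block)
    final_shape = voltage.shape  # read while the file is still open

print(final_shape)
```

Writing to `dset[old_len:]` instead of `dset[-10**4:]` means the helper does not need to know the block size in advance, so blocks of different lengths could be appended the same way.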
    
