How to append data to one specific dataset in an HDF5 file with h5py

后悔当初 2020-12-04 17:56

I am looking for a way to append data to an existing dataset inside a .h5 file using Python (h5py).

A short intro to my project: I

1 answer
  • 2020-12-04 18:31

    I have found a solution that seems to work!

    Have a look at this: incremental writes to hdf5 with h5py!

    To append data to a specific dataset, first resize that dataset along the relevant axis, then write the new data into the newly added space at the end of the existing data.

    Thus, the solution looks like this:

    import h5py

    # Raw string so the backslash in the Windows-style path is kept literally.
    with h5py.File(r'.\PreprocessedData.h5', 'a') as hf:
        # For each dataset: grow axis 0 by the number of new rows,
        # then write the new rows into the freshly added tail.
        hf["X_train"].resize(hf["X_train"].shape[0] + X_train_data.shape[0], axis=0)
        hf["X_train"][-X_train_data.shape[0]:] = X_train_data

        hf["X_test"].resize(hf["X_test"].shape[0] + X_test_data.shape[0], axis=0)
        hf["X_test"][-X_test_data.shape[0]:] = X_test_data

        hf["Y_train"].resize(hf["Y_train"].shape[0] + Y_train_data.shape[0], axis=0)
        hf["Y_train"][-Y_train_data.shape[0]:] = Y_train_data

        hf["Y_test"].resize(hf["Y_test"].shape[0] + Y_test_data.shape[0], axis=0)
        hf["Y_test"][-Y_test_data.shape[0]:] = Y_test_data
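The resize-then-write pattern can also be wrapped in a small helper. This is only a sketch: the function name append_rows, the file name demo.h5 and the sample arrays are made up for illustration, and the demo dataset is created as resizable along axis 0 so that it can grow. The helper writes to an explicit start index instead of a negative slice, because with zero new rows a slice like [-0:] would select the whole dataset rather than nothing.

```python
import os
import tempfile

import h5py
import numpy as np


def append_rows(dataset, new_rows):
    """Grow `dataset` along axis 0 and write `new_rows` into the new tail."""
    new_rows = np.asarray(new_rows)
    n_new = new_rows.shape[0]
    if n_new == 0:
        return  # nothing to append
    old_len = dataset.shape[0]
    dataset.resize(old_len + n_new, axis=0)  # extend the first axis
    dataset[old_len:] = new_rows             # fill only the added rows


# Demonstration on a throwaway file (hypothetical name and data).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")

with h5py.File(demo_path, "w") as hf:
    # Axis 0 is unlimited, axis 1 stays fixed at 2 columns.
    hf.create_dataset("X_train", data=np.zeros((3, 2)),
                      chunks=True, maxshape=(None, 2))

with h5py.File(demo_path, "a") as hf:
    append_rows(hf["X_train"], np.ones((2, 2)))
    final_shape = hf["X_train"].shape
    tail = hf["X_train"][3:]
```

With such a helper, each of the four datasets above would need a single call, for example append_rows(hf["X_train"], X_train_data).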
    

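    However, note that the dataset must have been created as resizable, i.e. chunked and with an unlimited first axis in maxshape. For 1-D data that is maxshape=(None,); for multi-dimensional data, maxshape needs one entry per dimension, e.g. (None,) + orig_data.shape[1:]. For example:

    h5f.create_dataset('X_train', data=orig_data, compression="gzip", chunks=True, maxshape=(None,))

    Otherwise the dataset cannot be extended.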
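To make the maxshape requirement concrete, here is a minimal sketch that creates one resizable and one fixed-size dataset and tries to grow both. The file name, the dataset name X_fixed and the sample array are assumptions made for the demonstration. The sample data is 2-D, so maxshape gets one entry per dimension.

```python
import os
import tempfile

import h5py
import numpy as np

orig_data = np.arange(12, dtype="float32").reshape(4, 3)  # made-up sample data
path = os.path.join(tempfile.mkdtemp(), "resizable_demo.h5")

with h5py.File(path, "w") as h5f:
    # Unlimited along axis 0, fixed along the remaining axes.
    resizable = h5f.create_dataset(
        "X_train",
        data=orig_data,
        compression="gzip",
        chunks=True,
        maxshape=(None,) + orig_data.shape[1:],
    )
    # Created without maxshape: the maximum shape equals the initial shape.
    fixed = h5f.create_dataset("X_fixed", data=orig_data)

    resizable_maxshape = resizable.maxshape
    fixed_maxshape = fixed.maxshape

    # Growing the resizable dataset works.
    resizable.resize(6, axis=0)
    grown_shape = resizable.shape

    # Growing the fixed-size dataset raises an error.
    try:
        fixed.resize(6, axis=0)
        fixed_resize_failed = False
    except Exception:
        fixed_resize_failed = True
```

Checking dataset.maxshape on an existing file is a quick way to find out whether a dataset can be appended to, or whether it has to be recreated with an unlimited axis first.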
