Listing datasets in a group in HDF5

Submitted by 瘦欲@ on 2019-12-12 19:42:08

Question


I decided to store my data in HDF5 using its hierarchical structure instead of relying on the filesystem. Unfortunately, I'm having performance issues.

My data is laid out as follows: I have about 70 top-level groups, one per date, and each of them contains roughly 8000 datasets. I would like to list the number of datasets per day:

for date in hdf5.keys():  # hdf5 is an already opened h5py.File
    print(len(hdf5[date]))  # number of datasets in that day's group

I find it frustrating that this takes more than 2 seconds per iteration.

Also, I have two different HDF5 files with the above layout, and the bigger one is much slower at this.

What am I doing wrong?


Answer 1:


Try creating the file with the libver='latest' option:

import h5py
f = h5py.File('name.hdf5', 'w', libver='latest')  # 'w' creates (or overwrites) the file

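This should be much faster when a group holds many datasets or a dataset carries many attributes. The option only affects how objects are written, so groups in an existing file would need to be rewritten to benefit.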
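As a rough illustration, here is a minimal sketch that writes a small file with libver='latest' and then counts the members of each top-level group. The date-style group names, the dataset names, the sizes, and the helper functions build_file and count_datasets are all made up for this example; they are not from the original question, and the file is far smaller than the one described there.

```python
import os
import tempfile

import h5py
import numpy as np


def build_file(path, n_groups=3, n_datasets=50):
    """Create a small HDF5 file using the newest file format available."""
    # Mode 'w' creates the file; libver='latest' applies to objects written from here on.
    with h5py.File(path, "w", libver="latest") as f:
        for g in range(n_groups):
            # Hypothetical date-named group, standing in for one day of data.
            group = f.create_group(f"2016-03-{g + 1:02d}")
            for d in range(n_datasets):
                group.create_dataset(f"ds{d}", data=np.arange(4))


def count_datasets(path):
    """Return {group name: number of members} for every top-level group."""
    with h5py.File(path, "r") as f:
        return {name: len(f[name]) for name in f.keys()}


# Write to a temporary directory so the sketch does not touch existing files.
path = os.path.join(tempfile.mkdtemp(), "name.hdf5")
build_file(path)
counts = count_datasets(path)
for date, n in counts.items():
    print(date, n)
```

To check whether the option helps with your own data, you could build the same file once with and once without libver='latest' and time count_datasets on each.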



Source: https://stackoverflow.com/questions/35953404/listing-datasets-in-a-group-in-hdf5
