Compression of existing file using h5py

Submitted by 时光总嘲笑我的痴心妄想 on 2021-02-08 13:15:43

Question


I'm currently working on a project regarding compression of HDF5 datasets and recently began using h5py. I followed the basic tutorials and was able to open, create, and compress a file while it was being created. However, I've been unsuccessful when it comes to compressing an existing file (which is the aim of my work).

I've tried opening files using 'r+' and then compressing chunked datasets, but the file sizes have remained the same.

Any suggestions on what commands to use, or am I going about this the wrong way?


Answer 1:


The HDF Group provides a set of tools to convert, display, analyze, edit, and repack your HDF5 files.

You can compress an existing HDF5 file using the h5repack utility. You can also change the chunk size with the same utility.

h5repack can be used from the command line:

h5repack file1 file2                              # repacks file1, reclaiming unaccounted (free) space, into file2

h5repack -v -l CHUNK=1024 file1 file2             # applies a chunk size of 1024 to the datasets in file1

h5repack -v -l CHUNK=1024 -f GZIP=5 file1 file2   # makes chunks of 1024 and compresses them with GZIP level 5

h5repack --help                                   # prints the available help documentation

Detailed documentation is also available.
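
If you would rather stay in h5py than shell out to h5repack, one option is to copy the datasets into a new file with compression enabled. The following is a minimal sketch under my own assumptions (the file names in.h5/out.h5 are placeholders, and it only handles top-level datasets, not nested groups, attributes, or links):

import h5py

# Copy every top-level dataset of in.h5 into out.h5, compressed with gzip.
with h5py.File('in.h5', 'r') as src, h5py.File('out.h5', 'w') as dst:
    for name, obj in src.items():
        if isinstance(obj, h5py.Dataset):
            dst.create_dataset(name, data=obj[...],
                               chunks=True,          # let h5py choose a chunk shape
                               compression='gzip',   # portable; 'lzf' is faster
                               compression_opts=5)   # gzip level (1-9)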




Answer 2:


Compression is very easy to use in h5py. Check out the Wiki HowTo and Compression guides. Basically, it would be something like:

ds = myfile.create_dataset('ds', shape, dtype, compression='lzf')

There are also some subtleties in how you pick chunk sizes to optimize file size and access speed; see the Compression guide I linked to.

I do not remember which compression, if any, is on by default.
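
(For reference, h5py applies no compression unless you request one, and you can check a dataset's active filter via its compression attribute.) Here is a self-contained version of the snippet above, with the file name and data chosen purely for illustration:

import numpy as np
import h5py

with h5py.File('example.h5', 'w') as myfile:
    data = np.random.rand(1000, 1000)
    # LZF is a fast filter shipped with h5py; 'gzip' is slower but more portable.
    ds = myfile.create_dataset('ds', data=data, compression='lzf')
    print(ds.compression)  # prints 'lzf'; would be None for an uncompressed dataset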



Source: https://stackoverflow.com/questions/15903867/compression-of-existing-file-using-h5py
