How to compress data saved in HDF5?

Submitted by ぐ巨炮叔叔 on 2019-12-11 17:56:39

Question


I am using Python 2.7 to read a video and store it in HDF5. This is my code:

import h5py
import skvideo.datasets
import skvideo.io
videodata = skvideo.io.vread('./v_ApplyEyeMakeup_g01_c01.avi')
with h5py.File('./video.hdf5','w') as f:
    f['data'] = videodata
    f['label'] = 1

The problem is that the output HDF5 file is too large. It is 128 times larger than the original AVI file. What should I do to compress it or reduce its size? You can download the file at https://drive.google.com/open?id=0B1MrjZsURl2yNFM0ZTJfZ3pOZVU

I think we can compress it by using

f.create_dataset('data',data=videodata,compression='gzip',compression_opts=9)
f.create_dataset('label', data=1)

Now it is still 37 times larger than the original file. Thanks in advance.
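
For context on the sizes reported above: skvideo.io.vread returns the fully decoded frames as an array of 8-bit samples, so the uncompressed dataset grows with frames x height x width x channels, regardless of how small the AVI is. The sketch below only does that arithmetic. The 164 frames come from the chunk shape used in Answer 2; the 240 x 320 frame size is an assumption about this clip and is not stated in the thread. `raw_size_bytes` is a hypothetical helper name.

```python
def raw_size_bytes(shape, bytes_per_sample=1):
    """Size in bytes of an uncompressed array with the given shape."""
    total = bytes_per_sample
    for dim in shape:
        total *= dim
    return total


# (frames, height, width, channels)
# 164 frames is taken from the chunk shape in Answer 2;
# 240 x 320 is an ASSUMED frame size, not given in the thread.
shape = (164, 240, 320, 3)

raw = raw_size_bytes(shape)      # one byte per uint8 sample
raw_mib = raw / 2.0 ** 20        # 2.0 keeps the division exact on Python 2.7

print("raw bytes: %d" % raw)
print("raw MiB:   %.1f" % raw_mib)
```

Under these assumptions the decoded array is about 36 MiB before any HDF5 filter is applied, which is the baseline that gzip has to work against.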


Answer 1:


Your problem is best solved by using a suitable encoder for your video file. Depending on your use case, there are various encoding algorithms; for example, x265 compresses video well but requires a lot of computing resources to do so. Take a look here.

Recently I heard about another interesting encoder, Daala, which is good for online streaming; you can get more information here.

Generally it depends on what you expect from the encoding, but choosing a good encoder is the way to go; try searching for one.




Answer 2:


By adding chunking I was able to bring the output down to 7.2M, compared to 10M without it. So chunking definitely helps, but the result is still far from what dedicated video formats achieve. You may experiment with other filters from https://support.hdfgroup.org/services/filters.html, but I doubt they will improve the compression by an order of magnitude. So if you want to continue with h5py, you probably need to accept a larger file size. If that is not acceptable, try another file format.

import h5py
import skvideo.datasets
import skvideo.io
videodata = skvideo.io.vread('./v_ApplyEyeMakeup_g01_c01.avi')

print(videodata.shape)
with h5py.File('./video.hdf5','w') as f:
    f.create_dataset('data',
                      data=videodata,
                      compression='gzip',
                      compression_opts=9,
                      chunks=(164, 20, 20, 3))
    f.create_dataset('label', data=1)
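
The chunk shape is a tunable choice. The chunks above span all 164 frames, so reading a single frame has to decompress every chunk that covers it. A minimal sketch of an alternative, one chunk per frame, is shown below; it is an illustration and not the layout this answer measured, so the 7.2M figure does not apply to it. The small synthetic array stands in for the decoded video, and `write_compressed` is a hypothetical helper name.

```python
import os
import tempfile

import h5py
import numpy as np


def write_compressed(path, frames, level=9):
    """Store a (frames, height, width, channels) array with gzip, one chunk per frame."""
    n, h, w, c = frames.shape
    with h5py.File(path, 'w') as f:
        f.create_dataset('data',
                         data=frames,
                         chunks=(1, h, w, c),      # per-frame chunks (assumed layout)
                         compression='gzip',
                         compression_opts=level)
        f.create_dataset('label', data=1)


# Synthetic stand-in for the output of skvideo.io.vread (uint8, T x H x W x C)
frames = np.zeros((8, 24, 32, 3), dtype=np.uint8)
frames[:, 4:12, 4:12, :] = 200

path = os.path.join(tempfile.mkdtemp(), 'video.hdf5')
write_compressed(path, frames)

with h5py.File(path, 'r') as f:
    chunk_shape = f['data'].chunks     # chunk layout actually stored
    third_frame = f['data'][2]         # touches only the chunk holding frame 2
    restored = f['data'][...]          # full array, decompressed
    label = int(f['label'][()])

print(chunk_shape)
print(third_frame.shape)
```

gzip is lossless, so the array read back is identical to the one written. Whether per-frame chunks compress better or worse than the layout above depends on the video, so it is worth measuring both on the real file.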


Source: https://stackoverflow.com/questions/46278714/how-to-compress-the-data-that-saved-in-hdf5
