Is it possible to update dataset dimensions in hdf5 file using rhdf5 in R?

妖精的绣舞 submitted on 2019-12-11 21:13:48

Question


I am trying to update 7 datasets within 1 group in an hdf5 file, but the updated datasets have different dimension sizes than the originals (though the same dimensionality, i.e. 1D, 2D, and 3D). Is there a way to alter the dimension property in order to update the dataset? Alternatively, can I delete the previous group and then create a new group in its place? I'd rather not rebuild the entire h5 file (create file, create groups, create datasets), since it's fairly complex.

I am using the Bioconductor rhdf5 package in R.

Example data:

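# install and load the package from Bioconductor
# (biocLite() has been retired; BiocManager is the current installer)
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("rhdf5")
library(rhdf5)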

# create new h5 file and populate
created = h5createFile('example.h5')
created = h5createGroup('example.h5', 'foo')
h5write(matrix(1:10, nrow = 5, ncol = 2), 'example.h5', 'foo/A')

# updating dataset with data of same dimension is successful
h5write(matrix(11:20, nrow = 5, ncol = 2), 'example.h5', 'foo/A')

# updating dataset with data of different dimension fails
h5write(matrix(1:12, nrow = 6, ncol = 2), 'example.h5', 'foo/A')

Note: I've read data from hdf5 files in the past, but this is my first time writing data back out into the file, so perhaps this is a naive expectation.


Answer 1:


Unfortunately, the maximum size of an HDF5 dataset is fixed when it is created, and can't be increased afterwards. You're going to have to recreate at least the datasets you want to extend.

HDF5 does allow you to "delete" a dataset, but this only involves unlinking it, i.e. it becomes inaccessible, but the space is not reclaimed. rhdf5 doesn't seem to provide an interface to this, however. Someone more familiar with rhdf5 may be able to help you there.
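For readers on a newer rhdf5 than this answer was written against: more recent releases appear to expose an h5delete() function that unlinks an object by name. The following is a minimal sketch of the delete-and-recreate route, assuming h5delete() is available in your installed version; the temporary file name is only for illustration. As noted above, unlinking does not shrink the file on disk.

```r
library(rhdf5)

# Work on a scratch file so the sketch does not touch real data
h5file <- tempfile(fileext = ".h5")
h5createFile(h5file)
h5createGroup(h5file, "foo")
h5write(matrix(1:10, nrow = 5, ncol = 2), h5file, "foo/A")

# Unlink the old dataset
# (assumption: h5delete() exists in the installed rhdf5 version)
h5delete(h5file, "foo/A")

# The name is free again, so data with different dimensions can be written
h5write(matrix(1:12, nrow = 6, ncol = 2), h5file, "foo/A")

new_A <- h5read(h5file, "foo/A")
dim(new_A)
```

The same call should work on a whole group (for example "foo") if all seven datasets need replacing, after which the group and its datasets can be created again.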

You can set the maximum size in rhdf5 with

h5createDataset('example.h5', 'foo/A', c(10), maxdims=c(12))

(this call is taken from the rhdf5 reference manual). If you want an unlimited maxdims, it's a bit more involved: you first have to create a dataspace using HDF5 constants and use that to create your dataset.
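To make the maxdims idea concrete for the 2D case in the question, here is a rough sketch: the dataset is created with room to grow, then enlarged later with h5set_extent() before the bigger matrix is written. The dimensions, chunk size, and scratch file are assumptions chosen for illustration, not values from the question.

```r
library(rhdf5)

h5file <- tempfile(fileext = ".h5")
h5createFile(h5file)
h5createGroup(h5file, "foo")

# Start with 5 rows but reserve room for up to 12.
# A chunked layout is requested because HDF5 only resizes chunked datasets.
h5createDataset(h5file, "foo/A", dims = c(5, 2), maxdims = c(12, 2),
                storage.mode = "integer", chunk = c(5, 2))
h5write(matrix(1:10, nrow = 5, ncol = 2), h5file, "foo/A")

# Later: grow the dataset to 6 rows (still within maxdims), then overwrite it
h5set_extent(h5file, "foo/A", c(6, 2))
h5write(matrix(1:12, nrow = 6, ncol = 2), h5file, "foo/A")

grown <- h5read(h5file, "foo/A")
dim(grown)
```

This only helps for datasets you create yourself with a generous maxdims; a dataset that was created with maxdims equal to its current size cannot be grown this way.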




Answer 2:


Note that shrinking a dataset is possible. You can use the function "h5set_extent", available from rhdf5 version 2.11.4 onward (see the rhdf5 documentation).
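A small sketch of the shrinking case, assuming a chunked dataset (created explicitly here so the layout is not left to defaults) and a scratch file; the names and sizes are illustrative only.

```r
library(rhdf5)

h5file <- tempfile(fileext = ".h5")
h5createFile(h5file)

# Create a 6 x 2 integer dataset with an explicit chunked layout
h5createDataset(h5file, "A", dims = c(6, 2),
                storage.mode = "integer", chunk = c(3, 2))
h5write(matrix(1:12, nrow = 6, ncol = 2), h5file, "A")

# Shrink to 5 rows; values outside the new extent are discarded
h5set_extent(h5file, "A", c(5, 2))

shrunk <- h5read(h5file, "A")
dim(shrunk)
```

After shrinking, the dataset can be overwritten with a 5 x 2 matrix through the usual h5write() call.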



Source: https://stackoverflow.com/questions/25752873/is-it-possible-to-update-dataset-dimensions-in-hdf5-file-using-rhdf5-in-r
