h5py: how to rename dimensions?

Submitted by 拟墨画扇 on 2019-12-25 03:48:29

Question


I created a new file whose handle is fw.

fw.create_dataset('grp1/varname',data=arr)

The groups are created before this command. arr is a numpy array with shape (2,3). The file is created successfully. However, the dimensions are named phony_dim_0 and phony_dim_1. How do I change them to, say, m and n?

In general, how does one create dimensions within a group and then associate variables with them?

I tried,

fw['grp1/varname'].dims[0].label = 'm'

But this does not have the desired effect.

ncdump -h on the created file shows:

group: grp1 {

        dimensions:
                phony_dim_0 = 2 ;
                phony_dim_1 = 3 ;

        variables:

                float varname(phony_dim_0, phony_dim_1) ;
                        string varname:DIMENSION_LABELS = "m", NIL, NIL ;
        } // group grp1

Thanks

print([dim.label for dim in fw['grp1/varname'].dims]) does produce consistent output: [u'm', u'']

It does seem that HDF5 files have no provision for associating dimensions with groups. However, varname is a variable. How does one get:

   variables:
            float varname(m, phony_dim_1) ;
                    string varname:DIMENSION_LABELS = "m", NIL ;
    } // group grp1

in the output of ncdump -h or h5dump? I did try different options with h5dump.

Thanks.


Answer 1:


Part of the problem may be your use of ncdump.

I can make a simple file, and set the dims label for a dataset:

In [420]: import h5py; import numpy as np
In [421]: f = h5py.File('testdim.h5','w')
In [422]: ds = f.create_dataset('grp1/varname', data = np.arange(10))
In [423]: ds
Out[423]: <HDF5 dataset "varname": shape (10,), type "<i8">

Look at the dims attribute:

In [424]: ds.dims
Out[424]: <Dimensions of HDF5 object at 140382697336904>
In [426]: ds.dims[0]
Out[426]: <"" dimension 0 of HDF5 dataset at 140382697336904>
In [427]: ds.dims[0].label
Out[427]: ''
In [428]: ds.dims[0].label = 'm'

In [436]: dd=ds.dims[0]
In [437]: dd?
Type:        DimensionProxy
String form: <"m" dimension 0 of HDF5 dataset at 140382697336904>
Length:      0
File:        ~/.local/lib/python3.6/site-packages/h5py/_hl/dims.py
Docstring:   Represents an HDF5 "dimension".
In [439]: dd.values()
Out[439]: []
In [440]: dd.label
Out[440]: 'm'

The group does not have a dims:

In [442]: g = f['grp1']
In [443]: g
Out[443]: <HDF5 group "/grp1" (1 members)>
In [444]: g.dims
AttributeError: 'Group' object has no attribute 'dims'

In [446]: f.flush()

With h5dump:

1902:~/mypy$ h5dump testdim.h5 
HDF5 "testdim.h5" {
GROUP "/" {
   GROUP "grp1" {
      DATASET "varname" {
         DATATYPE  H5T_STD_I64LE
         DATASPACE  SIMPLE { ( 10 ) / ( 10 ) }
         DATA {
         (0): 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
         }
         ATTRIBUTE "DIMENSION_LABELS" {
            DATATYPE  H5T_STRING {
               STRSIZE H5T_VARIABLE;
               STRPAD H5T_STR_NULLTERM;
               CSET H5T_CSET_ASCII;
               CTYPE H5T_C_S1;
            }
            DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
            DATA {
            (0): "m"
            }
         }
      }
   }
}
}

With ncdump which is designed to show netcdf files:

1902:~/mypy$ ncdump -h testdim.h5 
netcdf testdim {

group: grp1 {
  dimensions:
    phony_dim_0 = 10 ;
  variables:
    int64 varname(phony_dim_0) ;
        string varname:DIMENSION_LABELS = "m" ;
  } // group grp1
}

As best I can tell, the HDF5 format does not have group dimensions; ncdump invents the phony_dim_N dimensions to fill that gap.

To reiterate the answer to your previous question, the documentation of HDF5 dimensions is:

http://docs.h5py.org/en/latest/high/dims.html

https://www.unidata.ucar.edu/software/netcdf/docs/interoperability_hdf5.html

About reading HDF5 files, that page says:

"If dimension scales are not used, then netCDF-4 can still edit the file, and will invent anonymous dimensions for each variable shape."

netCDF has shared dimensions; HDF5 has dimension scales. They aren't quite the same.

http://www.stcorp.nl/beat/documentation/harp/conventions/hdf5.html

In the HDF5 data model there is no concept of shared dimensions (unlike netCDF). The shape of an HDF5 dataset is specified as a list of dimension lengths. However, the netCDF-4 library uses HDF5 as its storage backend. It represents shared dimensions using HDF5 dimension scales.
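As a rough illustration of the dimension-scale route described in the pages quoted above, here is a minimal sketch that creates one 1-D dataset per axis, marks each as a dimension scale, and attaches it to the variable. The file name and the scale datasets grp1/m and grp1/n are hypothetical names chosen for this example, and Dataset.make_scale assumes h5py 2.10 or newer. Since netCDF-4 represents shared dimensions with dimension scales, ncdump may then report m and n instead of phony_dim_0 and phony_dim_1, but that has not been verified here.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file name, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "testdim_scales.h5")

with h5py.File(path, "w") as fw:
    arr = np.arange(6, dtype="f4").reshape(2, 3)
    var = fw.create_dataset("grp1/varname", data=arr)

    # One 1-D dataset per axis; the names "m" and "n" are illustrative.
    m = fw.create_dataset("grp1/m", data=np.arange(2))
    n = fw.create_dataset("grp1/n", data=np.arange(3))

    # Mark them as dimension scales (assumes h5py >= 2.10).
    m.make_scale("m")
    n.make_scale("n")

    # Associate each axis of the variable with its scale.
    var.dims[0].attach_scale(m)
    var.dims[1].attach_scale(n)

    # Labels are stored separately from scales and remain optional.
    var.dims[0].label = "m"
    var.dims[1].label = "n"

# Read the file back to confirm what was stored.
with h5py.File(path, "r") as fr:
    var = fr["grp1/varname"]
    scale_names = [var.dims[i][0].name for i in range(var.ndim)]
    labels = [dim.label for dim in var.dims]

print(scale_names, labels)
```

The scales live inside grp1 as ordinary datasets, which is the closest HDF5 gets to "dimensions within a group": the group itself still has no dims attribute, only the datasets do.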




Answer 2:


Your output says that the first dimension label of the varname dataset is "m". The phony_dim_N entries are placeholder dimension names that ncdump invents to carry each dimension's size; they aren't the labels.

What's the output of print([dim.label for dim in fw['grp1/varname'].dims])?
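To see the distinction for yourself, the sketch below sets a label on a fresh dataset and reads it back two ways: through dims, and through the DIMENSION_LABELS attribute that h5dump displays. The file name is a hypothetical choice for this example, and the bytes-to-str normalization is only a precaution in case an h5py version returns bytes for the raw attribute.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file name, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "labels_only.h5")

with h5py.File(path, "w") as f:
    ds = f.create_dataset("grp1/varname", data=np.arange(10))
    ds.dims[0].label = "m"  # sets a label only; no dimension scale is attached

with h5py.File(path, "r") as f:
    ds = f["grp1/varname"]
    # The label as seen through the high-level dims interface.
    labels = [dim.label for dim in ds.dims]
    # The same label as stored in the DIMENSION_LABELS attribute.
    raw = [x.decode() if isinstance(x, bytes) else x
           for x in ds.attrs["DIMENSION_LABELS"]]
    # Number of dimension scales attached to axis 0.
    n_scales = len(ds.dims[0])

print(labels, raw, n_scales)
```

The label is present in both views while no scale is attached to the axis, which is consistent with ncdump still falling back to a phony dimension name.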



Source: https://stackoverflow.com/questions/52978389/h5py-how-to-rename-dimensions
