Writing to compound dataset with variable length string via h5py (HDF5)

别那么骄傲 2020-12-22 01:00

I've been able to create a compound dataset consisting of an unsigned int and a variable-length string in my HDF5 file using h5py, but I can't write to it.



        
1 Answer
  • 2020-12-22 01:32

    This follows on from my earlier answer to "Inexplicable behavior when using vlen with h5py".

    I ran this test (h5py version '2.2.1'):

    In [4]: import h5py
    In [5]: dt = h5py.special_dtype(vlen=str)
    In [6]: f=h5py.File('foo.hdf5', 'a')
    In [8]: ds1 = f.create_dataset('JustStrings',(10,), dtype=dt)
    In [10]: ds1[0]='string'
    In [11]: ds1[1]='a longer string'
    In [13]: ds1[2:5]='one_string two_strings three'.split()
    
    In [14]: ds1
    Out[14]: <HDF5 dataset "JustStrings": shape (10,), type "|O4">
    
    In [15]: ds1.value
    Out[15]: 
    array(['string', 'a longer string', 'one_string', 'two_strings', 'three',
           '', '', '', '', ''], dtype=object)
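
    For readers on a newer h5py, here is a minimal standalone sketch of the same string-only test. It assumes an in-memory file (the `core` driver with `backing_store=False`) and reads with `ds[0:5]`, because the `.value` attribute used above was removed in h5py 3.0. The file name and the `as_text` helper are illustrative, not part of the session above.

```python
import h5py

def as_text(x):
    # h5py 3.x returns bytes for variable-length strings; older versions return str.
    return x.decode("utf-8") if isinstance(x, bytes) else x

dt = h5py.special_dtype(vlen=str)  # variable-length string dtype, as in the session above

# Assumption: an in-memory file, so nothing is written to disk.
with h5py.File("vlen_demo.h5", "w", driver="core", backing_store=False) as f:
    ds = f.create_dataset("JustStrings", (10,), dtype=dt)
    ds[0] = "string"
    ds[1] = "a longer string"
    ds[2:5] = "one_string two_strings three".split()
    raw = ds[0:5]  # read the first five elements into a NumPy object array

words = [as_text(x) for x in raw]
print(words)
```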
    

    And for a mixed dtype like yours (with numpy imported as np):

    In [16]: ds2 = f.create_dataset('IntandStrings',(10,),
       dtype=np.dtype([("number",int),('astring',dt)]))
    In [17]: ds2[0]=(1,'astring')
    In [18]: ds2[1]=(10,'a longer string')
    In [19]: ds2[2:4]=[(10,'a longer much string'),(0,'')]
    In [20]: ds2.value
    Out[20]: 
    array([(1, 'astring'), (10, 'a longer string'),
           (10, 'a longer much string'), (0, ''), (0, ''), (0, ''), (0, ''),
           (0, ''), (0, ''), (0, '')], 
          dtype=[('number', '<i4'), ('astring', 'O')])
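
    The compound case can be sketched as a self-contained script too. The assumptions here: `np.uint32` stands in for the question's "unsigned int" (the session above used plain `int`), the file lives in memory, and records are written either as a tuple or as a structured NumPy array that has the dataset's dtype.

```python
import h5py
import numpy as np

str_dt = h5py.special_dtype(vlen=str)
# Assumption: uint32 models the "unsigned int" mentioned in the question.
rec_dt = np.dtype([("number", np.uint32), ("astring", str_dt)])

with h5py.File("compound_demo.h5", "w", driver="core", backing_store=False) as f:
    ds = f.create_dataset("IntandStrings", (4,), dtype=rec_dt)
    ds[0] = (1, "astring")  # one whole record as a tuple
    # Several whole records at once, as a structured array with the same dtype.
    ds[1:3] = np.array([(10, "a longer string"), (20, "a much longer string")],
                       dtype=rec_dt)
    data = ds[()]  # read everything back as a structured array

numbers = data["number"].tolist()
# Decode bytes (h5py 3.x) and pass str through unchanged (older versions).
strings = [s.decode("utf-8") if isinstance(s, bytes) else s for s in data["astring"]]
print(numbers, strings)
```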
    

    Trying to set a field by itself does not seem to work:

    ds2['astring'][4]='one two three four'
    

    Instead I have to set the whole record:

    ds2[4]=(123,'one two three four')
    

    Trying to set the whole field produces the same error:

    ds2['astring']='astring'
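
    A likely reason the element assignment has no effect (my reading, not something verified in the session above): indexing a dataset by field name reads that field into a NumPy array, so `ds2['astring'][4] = ...` edits a temporary in-memory copy rather than the file. A read-modify-write sketch gets around this by always writing whole records back. The `set_string` helper, the file name, and the `uint32` field type are illustrative assumptions.

```python
import h5py
import numpy as np

str_dt = h5py.special_dtype(vlen=str)
rec_dt = np.dtype([("number", np.uint32), ("astring", str_dt)])

def as_text(x):
    # h5py 3.x returns bytes for variable-length strings; older versions return str.
    return x.decode("utf-8") if isinstance(x, bytes) else x

def set_string(ds, index, text):
    """Hypothetical helper: change only 'astring' in one record, keeping its number."""
    rec = ds[index:index + 1]    # 1-element structured array (an in-memory copy)
    rec["astring"][0] = text     # edit the copy
    ds[index:index + 1] = rec    # write the whole record back to the dataset

with h5py.File("rmw_demo.h5", "w", driver="core", backing_store=False) as f:
    ds = f.create_dataset("IntandStrings", (3,), dtype=rec_dt)
    ds[1] = (123, "old text")
    set_string(ds, 1, "one two three four")
    rec = ds[1]
    single = (int(rec["number"]), as_text(rec["astring"]))

    # Whole-field update: read all records, replace the column, write all records.
    data = ds[()]
    data["astring"] = ["a", "b", "c"]
    ds[...] = data
    after = ds[()]

column = [as_text(s) for s in after["astring"]]
kept_numbers = after["number"].tolist()
print(single, column, kept_numbers)
```

    The cost is one extra read per update, which is fine for small datasets but worth batching for large ones.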
    

    I initialized this dataset with shape (10,), while yours is (1,), but I think it's the same problem.

    I can, though, set the whole numeric field:

    In [48]: ds2['number']=np.arange(10)
    In [50]: ds2['number']
    Out[50]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    In [51]: ds2.value
    Out[51]: 
    array([(0, 'astring'), (1, 'a longer string'), 
           (2, 'a longer much string'),
           (3, ''), (4, 'one two three four'), (5, ''), 
           (6, ''), (7, ''),
           (8, ''), (9, '')], 
          dtype=[('number', '<i4'), ('astring', 'O')])
    