hdf5

HDF5 - concurrency, compression & I/O performance [closed]

Submitted by 元气小坏坏 on 2019-12-17 07:02:19
Question (closed as off-topic; not accepting answers): I have the following questions about HDF5 performance and concurrency: Does HDF5 support concurrent write access? Concurrency considerations aside, how is HDF5 I/O performance, and do compression rates affect it? Since I use HDF5 with Python, how does its performance compare…
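A minimal h5py sketch touching both points, assuming a single writer (the file name is made up): HDF5 has no multi-writer support outside parallel HDF5/MPI, but SWMR mode (with libver='latest') lets readers open the file while one process appends, and gzip chunk compression trades CPU time for I/O volume.

import h5py
import numpy as np

# Single-writer/multiple-reader (SWMR): one writer, any number of readers.
# Concurrent *writers* require parallel HDF5 built against MPI.
with h5py.File('concurrency_demo.h5', 'w', libver='latest') as f:
    dset = f.create_dataset('x', shape=(0, 100), maxshape=(None, 100),
                            chunks=(64, 100),
                            compression='gzip', compression_opts=4)
    f.swmr_mode = True            # readers may now attach while we append
    dset.resize((64, 100))
    dset[:] = np.random.rand(64, 100)
    dset.flush()                  # make the new rows visible to readers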

Optimal HDF5 dataset chunk shape for reading rows

Submitted by 不羁的心 on 2019-12-17 02:49:18
Question: I have a reasonably sized (18 GB compressed) HDF5 dataset and am looking to optimize row reads for speed. The shape is (639038, 10000). I will be reading selections of rows (say ~1000 at a time) many times, scattered across the dataset, so I can't slice with x:(x+1000). Reading rows from out-of-memory HDF5 is already slow with h5py, since I have to pass a sorted list and resort to fancy indexing. Is there a way to avoid fancy indexing, or is there a better chunk shape/size I can use? I have…
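A sketch of the usual chunking advice, not taken from an answer to this question: align chunks with whole rows so each requested row costs exactly one chunk read. The file and dataset names are illustrative.

import h5py
import numpy as np

# One chunk = one full row (10000 float32 values, ~40 KB): small enough to
# decompress cheaply, large enough to amortize per-chunk overhead.
with h5py.File('rows.h5', 'w') as f:
    f.create_dataset('data', shape=(639038, 10000), dtype='f4',
                     chunks=(1, 10000), compression='lzf')

with h5py.File('rows.h5', 'r') as f:
    rows = np.sort(np.random.choice(639038, size=1000, replace=False))
    batch = f['data'][rows, :]    # h5py fancy indexing wants a sorted index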

Pandas HDF5 Select with Where on non natural-named columns

Submitted by 佐手、 on 2019-12-14 03:52:58
Question: In my continuing spree of exotic pandas/HDF5 issues, I ran into the following. I have a series of non-natural-named columns (nb: for a good reason, with negative numbers being "system" ids etc.), which normally doesn't cause an issue:

fact_hdf.select('store_0_0', columns=['o', 'a-6', 'm-13'])

However, my select statement falls over when I add a where clause:

>>> fact_hdf.select('store_0_0', columns=['o', 'a-6', 'm-13'], where=[('a-6', '=', [0, 25, 28])])
blablabla
File "/srv/www/li/venv/local/lib…
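One workaround sketch (my suggestion, not the thread's accepted answer): since PyTables where clauses can only reference natural, identifier-like column names, pull the columns first and filter in pandas. The file name 'facts.h5' is hypothetical; the store key and column names come from the question.

import pandas as pd

with pd.HDFStore('facts.h5') as store:
    df = store.select('store_0_0', columns=['o', 'a-6', 'm-13'])
    df = df[df['a-6'].isin([0, 25, 28])]   # filter in memory instead of where=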

Visualization of a 3-dimensional grid from X_Y_Z (separate datasets) in Paraview without using xdmf

Submitted by ℡╲_俬逩灬. on 2019-12-14 02:43:56
Question: Reading netCDF files in Paraview via xdmf: I used to parse netCDF files with an xdmf script in order to create a 3DSMesh in Paraview, and on top of it I added scalar or vector fields (so the 3DSMesh provides the physical coordinates). I never really considered whether this is the best way to do it; it works, so I was content. Please let me know if there is a more convenient way. I am able to create a 3-dimensional grid with the following script:

<?xml version="1.0" ?>
<!DOCTYPE Xdmf SYSTEM "Xdmf.dtd" []>
<Xdmf…
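For context, a small h5py sketch (grid size and names are made up, not from the question) that writes the kind of separate X, Y, Z coordinate datasets the title refers to; this is the layout an XDMF GeometryType="X_Y_Z" block, or one of Paraview's HDF5 readers, would point at.

import h5py
import numpy as np

# Write the coordinates as three separate datasets, one per axis.
nx, ny, nz = 16, 16, 16
x, y, z = np.meshgrid(np.linspace(0.0, 1.0, nx),
                      np.linspace(0.0, 1.0, ny),
                      np.linspace(0.0, 1.0, nz), indexing='ij')
with h5py.File('grid.h5', 'w') as f:
    f['X'] = x
    f['Y'] = y
    f['Z'] = z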

HDF5 headers missing in installation of netCDF4 module for Python

Submitted by 对着背影说爱祢 on 2019-12-13 20:09:09
Question: I have attempted to install the netCDF4 module several times now and I keep getting the same error:

Traceback (most recent call last):
File "<string>", line 17, in <module>
File "C:\Users\User\AppData\Local\Temp\pycharm-packaging0.tmp\netCDF4\setup.py", line 219, in <module>
raise ValueError('did not find HDF5 headers')
ValueError: did not find HDF5 headers

I tried using the official HDF installer from their website and I am still getting the same error (though during the installation the…
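A sketch of the usual fix, assuming a netCDF4 release whose setup.py honors the HDF5_DIR environment variable (recent releases do); the install path below is hypothetical.

import os
import subprocess

# Point the netCDF4 build at an existing HDF5 installation before pip runs.
os.environ['HDF5_DIR'] = r'C:\Program Files\HDF_Group\HDF5\1.10.5'  # hypothetical path
subprocess.run(['pip', 'install', 'netCDF4'], check=True)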

How to use HDF5 in Windows Java project with NetBeans

Submitted by 点点圈 on 2019-12-13 19:31:40
Question: I have a simple Java project and I have to write some data to an HDF5 file. I use NetBeans under Windows. Normally, I build libraries from the respective jar files; so much for my knowledge of how to do things ;) I downloaded and installed the binaries from the HDF5 download page, but what comes next? I had a look at the HDF5-Java support page but did not get any clue as to how to integrate HDF5 into my Java application. P.S.: I found sis-jhdf5 but did not get it running either. I also found some…

HDF5 adding numpy arrays slow

Submitted by 徘徊边缘 on 2019-12-13 16:29:24
Question: This is my first time using HDF5, so could you help me figure out what is wrong and why adding 3D numpy arrays is slow? Preprocessing takes 3 s; adding one 3D numpy array (100x512x512) takes 30 s and rises with each sample. First I create the HDF5 file with:

def create_h5(fname_):
    """Run only once to create the h5 file for DICOM images."""
    f = h5py.File(fname_, 'w', libver='latest')
    dtype_ = h5py.special_dtype(vlen=bytes)
    num_samples_train = 1397
    num_samples_test = 1595 - 1397
    num_slices = 100
    f.create_dataset('X_train', (num…
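A chunking sketch of my own (not from the thread), assuming uint8 pixel data: give the dataset an explicit per-slice chunk shape so writing each sample costs a constant amount of I/O instead of forcing HDF5 to grow and rewrite storage.

import h5py
import numpy as np

with h5py.File('dicom.h5', 'w', libver='latest') as f:
    # One chunk per 512x512 slice (~256 KB for uint8); each 100-slice sample
    # then maps to exactly 100 fixed-size chunk writes.
    X = f.create_dataset('X_train', shape=(1397, 100, 512, 512),
                         dtype='uint8', chunks=(1, 1, 512, 512))
    sample = np.zeros((100, 512, 512), dtype='uint8')  # stand-in for one scan
    X[0] = sample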

How do I change the data type in an HDF5 file from MATLAB?

Submitted by £可爱£侵袭症+ on 2019-12-13 14:06:01
Question: I have an HDF5 data file that contains an array of int32 values. I wish to change the data stored in that array to values of a different type (double). For example, I can query the data type with the following:

finf = h5info('file.hdf5');
finf.Datasets(1).Datatype

ans =
      Name: ''
     Class: 'H5T_INTEGER'
      Type: 'H5T_STD_I32LE'
      Size: 4
Attributes: []

If I try to recreate the data at the same node location, it gives me an error that the dataset already exists: t=double…
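An equivalent sketch in Python/h5py (the question itself is about MATLAB, and the dataset name 'dset' is hypothetical): HDF5 fixes a dataset's type at creation, so the conversion has to delete the node and recreate it with the new type.

import h5py

with h5py.File('file.hdf5', 'a') as f:
    data = f['dset'][...].astype('float64')  # read and convert the int32 values
    del f['dset']                            # unlink the old dataset
    f.create_dataset('dset', data=data)      # recreate it with a double type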

Read HDF5 file into numpy array

Submitted by 孤街浪徒 on 2019-12-13 11:32:24
Question: I have the following code to read an HDF5 file as a numpy array:

hf = h5py.File('path/to/file', 'r')
n1 = hf.get('dataset_name')
n2 = np.array(n1)

and when I print n2 I get this:

Out[15]: array([[<HDF5 object reference>, <HDF5 object reference>, <HDF5 object reference>, <HDF5 object reference>...

How can I read the HDF5 object references to view the data stored in them?

Answer 1: The easiest thing is to use the .value attribute of the HDF5 dataset.

>>> hf = h5py.File('/path/to/file', 'r')
>>> data =
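A sketch of dereferencing such object references (dataset names are hypothetical; note that .value was removed in h5py 3.x in favor of [...] reads):

import h5py
import numpy as np

with h5py.File('path/to/file', 'r') as hf:
    refs = hf['dataset_name'][...]   # a numpy array of <HDF5 object reference>
    target = hf[refs[0, 0]]          # dereference through the file handle
    data = np.array(target)          # now the actual values, not references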

Video classification using HDF5 in CAFFE?

Submitted by 天大地大妈咪最大 on 2019-12-13 10:26:08
Question: I am using an HDF5 layer for video classification (C3D). This is my code to generate the HDF5 file:

import h5py
import numpy as np
import skvideo.datasets
import skvideo.io

videodata = skvideo.io.vread('./v_ApplyEyeMakeup_g01_c01.avi')
videodata = videodata.transpose(3, 0, 1, 2)  # to channel x depth x h x w
videodata = videodata[None, :, :, :]
with h5py.File('./data.h5', 'w') as f:
    f['data'] = videodata
    f['label'] = 1

Now the path to data.h5 is saved in the file video.list. I perform the classification based on the prototxt…
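One likely pitfall worth sketching (an assumption on my part, based on how Caffe's HDF5Data layer reads datasets): both data and label need a leading batch axis and a float type, whereas f['label'] = 1 writes a scalar integer. The clip below is a zero-filled stand-in with made-up dimensions.

import h5py
import numpy as np

video = np.zeros((3, 16, 128, 171), dtype=np.float32)  # stand-in C x D x H x W clip
with h5py.File('data.h5', 'w') as f:
    f.create_dataset('data', data=video[None])                       # shape (1, C, D, H, W)
    f.create_dataset('label', data=np.array([1], dtype=np.float32))  # shape (1,)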