h5py

Open .h5 file in Python

我的梦境 submitted on 2020-01-14 08:04:09
Question: I am trying to read an .h5 file in Python. The file can be found at this link and is called 'vstoxx_data_31032014.h5'. The code I am trying to run is from the book Python for Finance by Yves Hilpisch and goes like this:

import pandas as pd
h5 = pd.HDFStore('path.../vstoxx_data_31032014.h5', 'r')
futures_data = h5['futures_data']  # VSTOXX futures data
options_data = h5['options_data']  # VSTOXX call option data
h5.close()

I am getting the following error: h5 = pd.HDFStore('path.../vstoxx_data
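The open/read/close pattern in the question can be sketched with plain h5py as well. This is a minimal, self-contained sketch: the file name 'demo.h5' and its contents are invented stand-ins (the book's actual file is vstoxx_data_31032014.h5 and must exist at whatever path you pass in).

```python
import h5py
import numpy as np

# 'demo.h5' is a stand-in; the book's file is vstoxx_data_31032014.h5,
# which must exist at the path you pass to h5py/HDFStore.
path = "demo.h5"

# Create a small file first so the read below has something to open.
with h5py.File(path, "w") as f:
    f["futures_data"] = np.arange(6.0).reshape(2, 3)

# Open read-only, pull the dataset fully into memory, and let the
# context manager close the handle.
with h5py.File(path, "r") as f:
    futures_data = f["futures_data"][()]  # [()] reads the whole dataset

print(futures_data.shape)
```

Using a `with` block also sidesteps the stale-open-handle errors that an explicit `close()` can leave behind after an exception.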

Optimising HDF5 dataset for Read/Write speed

倾然丶 夕夏残阳落幕 submitted on 2020-01-14 06:04:06
Question: I'm currently running an experiment in which I scan a target spatially and grab an oscilloscope trace at each discrete pixel. Generally my trace lengths are 200 Kpts. After scanning the entire target, I assemble these time-domain signals spatially and essentially play back a movie of what was scanned. My scan area is 330x220 pixels, so the entire dataset is larger than the RAM on the computer I have to use. To start with, I was just saving each oscilloscope trace as a numpy array and then after
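For this access pattern, the usual advice is to choose chunk shapes that match how the data is written and read. Below is a minimal sketch with toy dimensions standing in for the 330x220 scan and 200K-point traces; the dataset name and shapes are invented for illustration.

```python
import h5py
import numpy as np

# Toy dimensions standing in for the 330x220 scan with 200K-point traces.
ny, nx, npts = 8, 6, 1000

with h5py.File("scan.h5", "w") as f:
    # One full trace per chunk: each write during the scan touches exactly
    # one chunk, and the dataset never has to fit in RAM all at once.
    dset = f.create_dataset(
        "traces", shape=(ny, nx, npts), dtype="f4", chunks=(1, 1, npts)
    )
    for iy in range(ny):
        for ix in range(nx):
            dset[iy, ix, :] = np.random.randn(npts)  # one trace per pixel

with h5py.File("scan.h5", "r") as f:
    frame = f["traces"][:, :, 0]  # one 'movie frame' across all pixels

print(frame.shape)
```

Note the trade-off: trace-aligned chunks make writes fast, but reading one time slice across all pixels touches every chunk, so playback-heavy workloads may prefer chunks shaped more like (ny, nx, small_t).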

is it possible to use np arrays as indices in h5py datasets?

守給你的承諾、 submitted on 2020-01-11 14:07:09
Question: I need to merge a number of datasets, each contained in a separate file, into another dataset belonging to a final file. The order of the data in the partial datasets is not preserved when they get copied into the final one - the data in the partial datasets is 'mapped' into the final one through indices. I created two lists, final_indices and partial_indices, and wrote:

final_dataset = final_hdf5file['dataset']
partial_dataset = partial_hdf5file['dataset']
# here partial and final_indices are
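h5py does accept NumPy index arrays for reading and writing, with one important restriction that plain NumPy does not have: the indices must be strictly increasing with no repeats. A minimal sketch (file and dataset names invented for illustration):

```python
import h5py
import numpy as np

with h5py.File("merge.h5", "w") as f:
    final = f.create_dataset("dataset", shape=(6,), dtype="i8")
    partial = np.array([10, 20, 30])

    # h5py accepts an array of indices, but it must be strictly
    # increasing with no duplicates (unlike general NumPy fancy indexing).
    final_indices = np.array([1, 3, 5])
    final[final_indices] = partial

    result = final[()]

print(result)
```

If the mapping produces unsorted indices, one workaround is to argsort the index array and reorder the partial data to match before writing.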

Python-created HDF5 dataset transposed in Matlab

浪子不回头ぞ submitted on 2020-01-11 08:26:09
Question: I have some data that I share between Python and MATLAB. I used to do it by saving NumPy arrays in MATLAB-style .mat files, but I would like to switch to HDF5 datasets. However, I've noticed a funny feature: when I save a NumPy array in an HDF5 file (using h5py) and then read it in MATLAB (using h5read), it ends up being transposed. Is there something I'm missing? Python code:

import numpy as np
import h5py
mystuff = np.random.rand(10,30)
f = h5py.File('/home/user/test.h5', 'w')
f['mydataset'] =
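This is a memory-layout difference, not a bug: h5py writes arrays in C (row-major) order, while MATLAB interprets data column-major, so the same bytes appear with reversed dimensions there. A sketch of the question's setup (the file path here is a local stand-in, not the original /home/user/test.h5):

```python
import h5py
import numpy as np

mystuff = np.random.rand(10, 30)

with h5py.File("test_order.h5", "w") as f:
    f["mydataset"] = mystuff

# h5py stores row-major; MATLAB's h5read is column-major, so MATLAB
# would report this dataset as 30x10. A common workaround is to write
# mystuff.T from Python so MATLAB sees the intended 10x30 shape.
with h5py.File("test_order.h5", "r") as f:
    shape_on_disk = f["mydataset"].shape

print(shape_on_disk)
```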

Getting multiple datasets from group in HDF5

我只是一个虾纸丫 submitted on 2020-01-07 03:05:52
Question: I am comparing two different HDF5 files to make sure that they match. I want to create a list of all the datasets in a group of the HDF5 file so that a loop can run through all of the datasets, instead of my entering them manually. I can't seem to find a way to do this. Currently I am getting each dataset with this code:

tdata21 = ft['/PACKET_0/0xeda9_data_0004']

The names of the sets are located in the "PACKET_0" group. Once I arrange all of the datasets, I compare the data in
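Group objects behave like dictionaries, so `keys()` gives the member names without typing them manually. A small sketch that builds a file resembling the question's layout (the second dataset name is invented for illustration):

```python
import h5py
import numpy as np

# Build a toy file with the question's layout: a 'PACKET_0' group
# holding several datasets.
with h5py.File("packets.h5", "w") as f:
    g = f.create_group("PACKET_0")
    g["0xeda9_data_0004"] = np.arange(4)
    g["0xeda9_data_0005"] = np.arange(5)

with h5py.File("packets.h5", "r") as ft:
    names = list(ft["/PACKET_0"].keys())  # every member name in the group
    data = {n: ft["/PACKET_0"][n][()] for n in names}

print(names)
```

With the names in hand, the comparison loop can simply iterate over `names` for both files and compare the arrays pairwise.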

Opening a mat file using h5py and convert data into a numpy matrix

允我心安 submitted on 2020-01-06 19:52:28
Question: I have a .mat file which contains two different cells containing matrices of different sizes. I need to convert that data into a numpy array using h5py for an experiment (I'm new to h5py; I thought it was as easy as explained here). Reading the file works well, and putting the data into the numpy array also works well, but I need the value representation of each position inside each matrix inside each cell, taking into account that when I print, for example, np.array(x[0][1]), I receive just the
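In v7.3 .mat files, cell arrays are stored as HDF5 object references, so indexing the cell dataset yields references rather than values; they must be dereferenced through the file handle. The sketch below builds a toy file with that same reference structure (all names here are invented, since the original .mat file isn't available):

```python
import h5py
import numpy as np

# Emulate a v7.3 .mat cell: a dataset of object references pointing at
# two matrices of different sizes (names invented for this sketch).
with h5py.File("cells.h5", "w") as f:
    a = f.create_dataset("mat_a", data=np.ones((2, 2)))
    b = f.create_dataset("mat_b", data=np.zeros((3, 1)))
    refs = f.create_dataset("x", shape=(2,), dtype=h5py.ref_dtype)
    refs[0] = a.ref
    refs[1] = b.ref

with h5py.File("cells.h5", "r") as f:
    x = f["x"]
    # x[0] is only an <HDF5 object reference>; dereference it via the
    # file handle to reach the actual values.
    first = f[x[0]][()]

print(first)
```

The same `f[reference][()]` pattern applies when `x` comes from a real .mat file opened with h5py.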

How can I combine multiple .h5 file?

倾然丶 夕夏残阳落幕 submitted on 2020-01-06 08:24:29
Question: Everything that is available online is too complicated. My database is large, so I exported it in parts. I now have three .h5 files and would like to combine them into one .h5 file for further work. How can I do it?

Answer 1: There are at least three ways to combine data from individual HDF5 files into a single file:

- Use external links to create a new file that points to the data in your other files (requires the pytables/tables module)
- Copy the data with the HDF Group utility h5copy.exe
- Copy the data
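The third option (copying within Python) can be done entirely in h5py with `Group.copy`, which works across open files. A minimal sketch, with part-file and dataset names invented for illustration:

```python
import h5py
import numpy as np

# Stand-ins for the three exported part files.
for i in range(3):
    with h5py.File(f"part{i}.h5", "w") as f:
        f["data"] = np.full(4, i)

# Copy each part's dataset into one combined file under a unique name.
with h5py.File("combined.h5", "w") as dest:
    for i in range(3):
        with h5py.File(f"part{i}.h5", "r") as src:
            src.copy("data", dest, name=f"data_part{i}")

with h5py.File("combined.h5", "r") as f:
    combined_keys = sorted(f.keys())

print(combined_keys)
```

If the parts are row-slices of one logical table rather than independent datasets, you would instead create one resizable dataset and append each part into it.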

How can I loop over HDF5 groups in Python removing rows according to a mask?

不羁岁月 submitted on 2020-01-05 04:05:40
Question: I have an HDF5 file containing a number of different groups, all of which have the same number of rows. I also have a Boolean mask of rows to keep or remove. I would like to iterate over all groups in the HDF5 file, removing rows according to the mask. The recommended method to recursively visit all groups is visit(callable), but I can't work out how to pass my mask to the callable. Here is some code that hopefully demonstrates what I would like to do but which doesn't work:

def apply_mask(name,
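The callable given to `visit`/`visititems` only receives the name (and object), so extra arguments like the mask are usually captured with a closure or `functools.partial`. A sketch with invented group/dataset names; note that HDF5 cannot shrink a dataset's rows in place, so the masked result is collected in memory here (in practice you would write it to a new dataset or file):

```python
import h5py
import numpy as np

with h5py.File("masked.h5", "w") as f:
    f["g1/rows"] = np.arange(5)
    f["g2/rows"] = np.arange(5) * 10

    mask = np.array([True, False, True, False, True])

    kept = {}

    # visititems calls back with (name, obj) only; the mask is captured
    # from the enclosing scope instead of being passed as an argument.
    def apply_mask(name, obj):
        if isinstance(obj, h5py.Dataset):
            kept[name] = obj[()][mask]  # read fully, then filter in memory

    f.visititems(apply_mask)

print(kept["g1/rows"])
```

An equivalent alternative is `f.visititems(functools.partial(apply_mask, mask=mask))` with `mask` added as a keyword parameter of the callable.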

Why do pickle + gzip outperform h5py on repetitive datasets?

故事扮演 submitted on 2020-01-05 03:04:43
Question: I am saving a numpy array which contains repetitive data:

import numpy as np
import gzip
import cPickle as pkl
import h5py

a = np.random.randn(100000, 10)
b = np.hstack([a[cnt:a.shape[0]-10+cnt+1] for cnt in range(10)])

f_pkl_gz = gzip.open('noise.pkl.gz', 'w')
pkl.dump(b, f_pkl_gz, protocol=pkl.HIGHEST_PROTOCOL)
f_pkl_gz.close()

f_pkl = open('noise.pkl', 'w')
pkl.dump(b, f_pkl, protocol=pkl.HIGHEST_PROTOCOL)
f_pkl.close()

f_hdf5 = h5py.File('noise.hdf5', 'w')
f_hdf5.create_dataset('b',
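A likely factor: `create_dataset` stores raw bytes unless compression is requested explicitly, whereas `gzip.open` compresses the pickle by construction. Enabling chunked storage with gzip (and the shuffle filter, which helps on floating-point data) closes much of the gap. A sketch with the question's construction at a smaller, repetitive scale:

```python
import os

import h5py
import numpy as np

# Repetitive data in the spirit of the question: overlapping shifted
# copies of the same rows (smaller and deterministic for the sketch).
a = np.arange(2000.0).reshape(200, 10)
b = np.hstack([a[c:a.shape[0] - 10 + c + 1] for c in range(10)])

# Compression is opt-in: request gzip, and add the shuffle filter so
# the byte stream exposes the repetition to the compressor.
with h5py.File("noise.h5", "w") as f:
    f.create_dataset("b", data=b, compression="gzip",
                     compression_opts=9, shuffle=True, chunks=True)

raw = b.nbytes
on_disk = os.path.getsize("noise.h5")
print(raw, on_disk)
```

Even with these options, gzip-of-pickle can still win on pathologically repetitive arrays because it compresses one contiguous stream, while HDF5 compresses chunk by chunk.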