hdf5

HDF5 C++ with third-party filters

半腔热情 submitted on 2019-12-13 06:34:25
Question: I am trying to write C++ code to create an HDF5 dataset with the third-party filters listed at https://support.hdfgroup.org/services/contributions.html. I created a snappy filter function that can compress as well as decompress data using the snappy library functions. I was able to write with the snappy filter and read it back without any problem. However, when I try to read the data through h5dump, I get no output even though I am using the correct filter ID (32003 for snappy).
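A likely culprit is that h5dump cannot locate the compiled filter plugin at run time. Below is a minimal sketch of the same round trip in Python/h5py, assuming the snappy filter has been built as an HDF5 dynamic plugin; the plugin path and file names are hypothetical. h5py accepts an integer filter ID for dynamically loaded filters, and h5dump honours the same HDF5_PLUGIN_PATH environment variable:

import os
import numpy as np
import h5py

# Hypothetical location of the compiled snappy plugin (.so/.dll); h5dump
# needs the same path exported in its environment to decode the dataset.
os.environ["HDF5_PLUGIN_PATH"] = "/usr/local/hdf5/lib/plugin"

SNAPPY_FILTER_ID = 32003  # registered ID from the HDF Group contributions page

with h5py.File("snappy_test.h5", "w") as f:
    # A non-gzip integer compression value tells HDF5 to look for a
    # dynamically loaded filter with that ID.
    f.create_dataset("data", data=np.arange(1000), chunks=(100,),
                     compression=SNAPPY_FILTER_ID)

If h5dump still prints nothing, exporting HDF5_PLUGIN_PATH in the shell that runs h5dump is the first thing to check.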

How to retrieve pandas df multiindex from HDFStore?

情到浓时终转凉″ submitted on 2019-12-13 05:06:22
Question: For a DataFrame with a simple index, one may retrieve the index from an HDFStore as follows:

df = pd.DataFrame(np.random.randn(2, 3), index=list('yz'), columns=list('abc'))
df
>>>           a         b         c
>>> y -0.181063  1.919440  1.550992
>>> z -0.701797  1.917156  0.645707

with pd.HDFStore('test.h5') as store:
    store.put('df', df, format='t')
    store.select_column('df', 'index')
>>> 0    y
>>> 1    z
>>> Name: index, dtype: object

as stated in the docs. But with a MultiIndex this trick doesn't work: df = pd
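For what it's worth, a table-format frame stores each MultiIndex level as its own queryable column named after the level, so the levels can be pulled out one by one. A minimal sketch, assuming named levels (unnamed levels typically get default names such as 'level_0'):

import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_product([list('yz'), [1, 2]],
                                 names=['letter', 'number'])
df = pd.DataFrame(np.random.randn(4, 3), index=idx, columns=list('abc'))

with pd.HDFStore('test_mi.h5') as store:
    store.put('df', df, format='t')
    letters = store.select_column('df', 'letter')   # one Series per level
    numbers = store.select_column('df', 'number')

# reassemble the MultiIndex without loading the data columns
index = pd.MultiIndex.from_arrays([letters, numbers],
                                  names=['letter', 'number'])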

Reading HDF5 into C++ with memory problems

[亡魂溺海] submitted on 2019-12-13 04:38:38
Question: I am rewriting code I had developed in Python into C++, mainly for an improvement in speed, while also hoping to gain more experience in this language. I also plan on using OpenMP to parallelize this code onto 48 cores which share 204 GB of memory. The program I am writing is simple: I import an HDF5 file which is 3D, A[T][X][E], where T is associated with each timestep from a simulation, X represents where the field is measured, and E(0:2) represents the electric field in x, y, z. Each element
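The question is truncated, but for memory problems with a 3-D array of this shape the standard remedy is to read one hyperslab at a time rather than the whole dataset. A minimal sketch of that pattern in h5py (file and dataset names hypothetical); it translates directly to H5Dread with a hyperslab selection in the C++ API:

import h5py

with h5py.File('fields.h5', 'r') as f:
    A = f['A']                   # dataset of shape (T, X, E)
    T, X, E = A.shape
    for t in range(T):
        slab = A[t, :, :]        # one timestep: resident memory stays at one T-slice
        # ... process slab ...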

HDF5 core driver (H5FD_CORE): loading selected dataset(s)

心已入冬 submitted on 2019-12-13 03:16:51
Question: Currently, I load HDF5 data in Python via h5py and read a dataset into memory:

f = h5py.File('myfile.h5', 'r')
dset = f['mydataset'][:]

This works, but if 'mydataset' is the only dataset in myfile.h5, then the following is more efficient:

f = h5py.File('myfile.h5', 'r', driver='core')
dset = f['mydataset'][:]

I believe this is because the 'core' driver memory-maps the entire file, which is an optimised way of loading data into memory. My question is: is it possible to use the 'core' driver on
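As far as I know, the 'core' driver operates on whole files, not individual datasets, so it cannot be restricted to one dataset among several. A minimal sketch contrasting the two options, where read_direct at least avoids one intermediate copy when only a single dataset is needed:

import numpy as np
import h5py

# 'core' driver: the whole file is pulled into memory on open
with h5py.File('myfile.h5', 'r', driver='core') as f:
    dset = f['mydataset'][:]

# default driver: read just one dataset into a preallocated array
with h5py.File('myfile.h5', 'r') as f:
    out = np.empty(f['mydataset'].shape, dtype=f['mydataset'].dtype)
    f['mydataset'].read_direct(out)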

How to save h5py arrays with different sizes?

▼魔方 西西 submitted on 2019-12-13 03:15:48
Question: I am referring in this question to this one. I am making this new thread because I did not really understand the answer given there, and hopefully there is someone who could explain it more to me. Basically my problem is like in the link there. Before, I used np.vstack and created an h5-format file from it. Below is my example:

import numpy as np
import h5py
import glob

path = "/home/ling/test/"

def runtest():
    data1 = [np.loadtxt(file) for file in glob.glob(path + "data1/*.csv")]
    data2 = [np.loadtxt(file)
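The usual way around np.vstack failing on differently-sized arrays is to give each array its own dataset inside a group, so nothing has to be padded or stacked. A minimal sketch with hypothetical file and group names:

import numpy as np
import h5py

arrays = [np.random.randn(3, 4), np.random.randn(5, 4), np.random.randn(2, 4)]

with h5py.File('ragged.h5', 'w') as f:
    grp = f.create_group('data1')
    for i, a in enumerate(arrays):
        grp.create_dataset(f'arr{i}', data=a)   # each array keeps its own shape

# reading everything back
with h5py.File('ragged.h5', 'r') as f:
    loaded = [f['data1'][name][:] for name in f['data1']]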

Convert hdf5 to raw organised in folders

让人想犯罪 __ submitted on 2019-12-13 02:41:19
Question: I use a script to match images against an atlas. This script's input is .raw images organised in folders like:

imageFolder
-- folder1
---- image1.raw
---- image2.raw
-- folder2
---- image1.raw
---- image2.raw

I have an image in hdf5 and I would like to convert it into multiple files organised as presented above. This organisation looks like hdf5, doesn't it? I would like to know if it's possible to do this in Python, and if it is, which package should I use? I looked at h5py but I didn't find a
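It is certainly possible with h5py alone: the group hierarchy maps naturally onto folders and each dataset onto one .raw file. A minimal sketch, assuming every leaf dataset should become a headerless .raw dump in the dataset's native dtype (file and folder names hypothetical):

import os
import h5py

def dump_raw(h5path, outdir):
    def visitor(name, obj):
        if isinstance(obj, h5py.Dataset):
            # mirror the HDF5 path ('folder1/image1' -> folder1/image1.raw)
            target = os.path.join(outdir, name + '.raw')
            os.makedirs(os.path.dirname(target), exist_ok=True)
            obj[:].tofile(target)    # raw bytes, no header
    with h5py.File(h5path, 'r') as f:
        f.visititems(visitor)

dump_raw('images.h5', 'imageFolder')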

python h5py: can I store a dataset in which different columns have different types?

浪子不回头ぞ submitted on 2019-12-13 02:37:18
Question: Suppose I have a table with many columns, of which only a few are float type and the others are small integers, for example:

col1  col2  col3  col4
1.31     1     2     3
2.33     3     5     4
...

How can I store this efficiently? If I use np.float32 for the whole dataset, storage is wasted, because the other columns only hold small integers and don't need that much space. If I use np.int16, the float column loses precision, which is also not what I want. So how do I deal with a situation like this? Suppose I also
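The standard answer is a compound datatype: build a NumPy structured dtype with a per-column type and hand it to h5py, which stores it as an HDF5 compound type, so each column keeps its own storage size. A minimal sketch with the hypothetical columns above:

import numpy as np
import h5py

dt = np.dtype([('col1', np.float32),   # float column keeps float storage
               ('col2', np.int16),     # integer columns stay 2 bytes each
               ('col3', np.int16),
               ('col4', np.int16)])

rows = np.array([(1.31, 1, 2, 3),
                 (2.33, 3, 5, 4)], dtype=dt)

with h5py.File('mixed.h5', 'w') as f:
    f.create_dataset('table', data=rows)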

Use HDF5 from Intel Fortran on Windows

别等时光非礼了梦想. submitted on 2019-12-13 02:16:44
Question: I would like to create an HDF5 dataset from a Fortran 90 program compiled with Intel Fortran 2011 on Windows 7 using Visual Studio 2010. Can I use prebuilt binaries, or how do I build new ones? Answer 1: I build from source; the available built binaries use the MS C/C++ compiler while I want to build with the Intel compiler, and they are built with Intel Fortran v12.x while I'm using v14.x. I won't say that you can't use the binaries, but I've had enough of a struggle in the past to persuade me to

Is it possible to directly rename a pandas dataframe's columns stored in an hdf5 file?

烈酒焚心 submitted on 2019-12-13 02:16:01
Question: I have a very large pandas dataframe stored in an hdf5 file, and I need to rename the columns of the dataframe. The straightforward way is to read the dataframe in chunks using HDFStore.select, rename the columns, and store the chunks to another hdf5 file. But I think this is a clumsy and inefficient way. Is there a way to rename the columns directly in the hdf5 file? Answer 1: It can be done by changing the meta-data. BIG WARNING: this may corrupt your file, so you proceed at your own risk. Create a store.
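A heavily hedged sketch of what "changing the meta-data" can look like for a table-format frame. It assumes pandas keeps the column labels in the storer attribute non_index_axes as a list of (axis, labels) pairs; that is a pandas/PyTables internal and may differ between versions, so back up the file before trying anything like this:

import pandas as pd

new_names = ['x', 'y', 'z']   # hypothetical replacement labels

with pd.HDFStore('test.h5') as store:
    attrs = store.get_storer('df').attrs
    axes = attrs.non_index_axes              # e.g. [(1, ['a', 'b', 'c'])]
    # rewrite the labels in place; everything else is left untouched
    attrs.non_index_axes = [(axes[0][0], new_names)]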

[Errno -101] NetCDF: HDF error when opening netcdf file

狂风中的少年 submitted on 2019-12-13 02:00:48
Question: I get this error when opening my netcdf file. The code was working before. How do I fix this?

Traceback (most recent call last):
  File "", line 1, in
  ...
  File "file.py", line 71, in gather_vgt
    return xr.open_dataset(filename)
  File "/.../lib/python3.6/site-packages/xarray/backends/api.py", line 286, in open_dataset
    autoclose=autoclose)
  File "/.../lib/python3.6/site-packages/xarray/backends/netCDF4_.py", line 275, in open
    ds = opener()
  File "/.../lib/python3.6/site-packages/xarray/backends
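The traceback is truncated, but an HDF error surfacing through the NetCDF layer typically means a corrupt or truncated file, a file still held open by a writer, or mismatched HDF5/netCDF library builds. A minimal sketch for narrowing it down by bypassing xarray entirely (file name hypothetical):

import netCDF4

try:
    with netCDF4.Dataset('myfile.nc', 'r') as ds:
        print(list(ds.variables))    # low-level open succeeded
except OSError as exc:
    # failure here too points at the file or the HDF5/netCDF install,
    # not at xarray
    print('low-level open failed:', exc)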