hdf5

Faster reading of time series from netCDF?

橙三吉。 Submitted on 2020-01-10 08:51:20
Question: I have some large netCDF files that contain 6-hourly data for the earth at 0.5 degree resolution. There are 360 latitude points, 720 longitude points, and 1420 time points per year. I have both yearly files (12 GB each) and one file with 110 years of data (1.3 TB) stored as netCDF-4 (here is an example of the 1901 data, 1901.nc, its use policy, and the original, public files that I started with). From what I understood, it should be faster to read from one netCDF file rather than looping over…
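A minimal sketch of the kind of point extraction involved, assuming the xarray library; the variable name "tas" and the coordinates are placeholders, not from the question. xarray opens the file lazily, so only the selected cell's time series is actually read from disk:

```python
# Sketch: pull one grid cell's full time series from a yearly file.
# Variable name "tas" and the lat/lon values are assumptions.
import xarray as xr

ds = xr.open_dataset("1901.nc")                      # lazy open, no data read yet
point = ds["tas"].sel(lat=52.25, lon=4.75, method="nearest")
series = point.values                                # data is read only here
ds.close()
```

Whether one big file beats looping over yearly files depends mostly on how the netCDF-4 chunk layout matches this access pattern.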

Getting multiple datasets from group in HDF5

我只是一个虾纸丫 Submitted on 2020-01-07 03:05:52
Question: I am comparing two different HDF5 files to make sure that they match. I want to create a list of all the datasets in the group in the HDF5 file so that a loop can run through all of them, instead of entering them manually. I can't seem to find a way to do this. Currently I am getting each dataset with this code: tdata21 = ft['/PACKET_0/0xeda9_data_0004']. The dataset names are located in the "PACKET_0" group. Once I arrange all of the datasets, I compare the data in…
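A minimal sketch of listing and looping over the group's members with h5py, matching the question's access style (the file name and the comparison step are placeholders):

```python
# Sketch: collect every dataset name under /PACKET_0, then loop over them.
import h5py

with h5py.File("first.h5", "r") as ft:
    names = list(ft["/PACKET_0"].keys())     # all member names in the group
    for name in names:
        tdata = ft["/PACKET_0"][name][()]    # read the dataset into memory
        # ... compare against the matching dataset in the second file
```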

Datatype class: H5T_FLOAT F0413 08:54:40.661201 17769 hdf5_data_layer.cpp:53] Check failed: hdf_blobs_[i]->shape(0) == num (1 vs. 1024)

試著忘記壹切 Submitted on 2020-01-06 15:27:13
Question: My data set is an HDF5 file consisting of data with shape [129028,1,12,1024] and labels of shape [129028,1,1,1]. But when I run solver.prototxt, I get the error message: I0413 08:54:34.689985 17769 hdf5.cpp:32] Datatype class: H5T_FLOAT F0413 08:54:40.661201 17769 hdf5_data_layer.cpp:53] Check failed: hdf_blobs_[i]->shape(0) == num (1 vs. 1024) *** Check failure stack trace: *** Answer 1: It looks like you saved your HDF5 file from MATLAB rather than Python (judging by your previous question). When…
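The usual cause is MATLAB's column-major storage reversing the dimension order, so Caffe sees the last axis (1024) where it expects num. A hedged sketch of writing the file from Python instead, where the row-major [N, C, H, W] order is preserved; the dataset names "data" and "label" follow Caffe's HDF5 data layer convention, and the zero arrays are placeholders:

```python
# Sketch: write a Caffe-readable HDF5 file with row-major shapes.
import h5py
import numpy as np

data = np.zeros((129028, 1, 12, 1024), dtype=np.float32)   # placeholder
label = np.zeros((129028, 1, 1, 1), dtype=np.float32)      # placeholder

with h5py.File("train.h5", "w") as f:
    f.create_dataset("data", data=data)
    f.create_dataset("label", data=label)
```

From MATLAB, the equivalent fix is to permute the array (e.g. permute(data, [4 3 2 1])) before writing, so the dimensions arrive in the order Caffe expects.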

How can I combine multiple .h5 files?

倾然丶 夕夏残阳落幕 Submitted on 2020-01-06 08:24:29
Question: Everything that is available online is too complicated. My database is large, so I exported it in parts. I now have three .h5 files and I would like to combine them into one .h5 file for further work. How can I do it? Answer 1: There are at least three ways to combine data from individual HDF5 files into a single file: use external links to create a new file that points to the data in your other files (requires the PyTables tables module); copy the data with the HDF Group utility h5copy.exe; copy the data…
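A minimal sketch of the last option, copying everything into a new file with h5py (file names are placeholders; this assumes the top-level names in the three files don't collide):

```python
# Sketch: deep-copy every top-level object from each part file.
import h5py

parts = ["part1.h5", "part2.h5", "part3.h5"]
with h5py.File("combined.h5", "w") as out:
    for path in parts:
        with h5py.File(path, "r") as src:
            for name in src:
                src.copy(name, out)    # copies the group or dataset wholesale
```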

C++ void pointer

别等时光非礼了梦想. Submitted on 2020-01-06 07:17:07
Question: I am using an HDF5 library to read data from an HDF5 file in C++, and the call I am having problems with is the following: status = H5Dread( hdf5_dataset, hdf5_datatype, hdf5_dataspace_in_memory, hdf5_dataspace_in_file, H5P_DEFAULT, buf ); The last argument is supposed to be a void pointer, and I have a vector of floats that I want the data read into; however, when I try to pass the vector, g++ gives me the following error: error: cannot convert ‘std::vector<float, std::allocator<float> >’ to ‘void*’…
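The fix is to pass the vector's contiguous buffer rather than the vector object itself. A sketch under the question's setup, in C++ to match the entry; the handles are assumed to be opened elsewhere as in the question:

```cpp
#include <vector>
#include "hdf5.h"

// Sketch: H5Dread takes a void*; std::vector<float> won't convert
// implicitly, but its underlying buffer can be passed via data().
herr_t read_floats(hid_t dataset, hid_t datatype,
                   hid_t memspace, hid_t filespace,
                   std::vector<float>& buf)
{
    // buf must already be sized to hold the selected elements;
    // H5Dread fills it in place.
    return H5Dread(dataset, datatype, memspace, filespace,
                   H5P_DEFAULT, buf.data());   // or &buf[0] before C++11
}
```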

How to de-reference a list of external links using pytables?

六眼飞鱼酱① Submitted on 2020-01-06 05:45:04
Question: I have created external links leading from one HDF5 file to another using PyTables. My question is how to de-reference them in a loop. For example, assume file_name = "collection.h5" is where the external links are stored. I created external links under the root node, and when I traverse the nodes under the root I get the following output: /link1 (ExternalLink) -> /files/data1.h5:/weights/Image, /link2 (ExternalLink) -> /files/data2.h5:/weights/Image, and so on. I know that for de-referencing a…
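In PyTables an ExternalLink node is callable: calling it opens the target file and returns the pointed-to node. A sketch looping over the links via the root group's _v_links mapping (read-only mode and the explicit close are assumptions about the intended usage):

```python
# Sketch: resolve every external link under the root in a loop.
import tables

with tables.open_file("collection.h5", mode="r") as f:
    for name, link in f.root._v_links.items():
        node = link(mode="r")        # opens e.g. data1.h5, returns /weights/Image
        print(name, "->", node)
        node._v_file.close()         # close the external file when done
```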

(In Pandas) Why is frequency information lost when storing in HDF5 as a Table?

给你一囗甜甜゛ Submitted on 2020-01-05 07:59:07
Question: I am storing time series data in HDF5 format with pandas. Because I want to be able to access the data directly on disk, I am using the PyTables format with table=True when writing. It appears that I then lose frequency information on my TimeSeries objects after writing them to HDF5. This can be seen by toggling the is_table value in the script below: import pandas as pd is_table = False times = pd.date_range('2000-1-1', periods=3, freq='H') series = pd.Series(xrange(3), index=times) print 'frequency…
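A hedged workaround sketch: the table format does not round-trip the index's freq attribute, but for a regular index the frequency can be re-inferred after reading (the store path and key are placeholders):

```python
# Sketch: restore the inferred frequency after reading a table-format store.
import pandas as pd

series = pd.read_hdf("store.h5", key="series")
series.index.freq = pd.infer_freq(series.index)   # e.g. "H" for hourly data
```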

HDFStore start stop not working

两盒软妹~` Submitted on 2020-01-05 07:34:46
Question: Is it clear what I am doing wrong? I'm experimenting with the pandas HDFStore.select start and stop options and they make no difference. The commands I'm using are: import pandas as pd hdf = pd.HDFStore(path % 'results') len(hdf.select('results',start=15,stop=20)) hoping to get a length of 4 or 5 or however it's counted, but it gives me the whole darn dataframe. Here is a screenshot: Answer 1: When writing to the h5 file, use pandas.to_hdf(<path>, <key>, format='table'), which enables subsets of…
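A small end-to-end sketch (path and key are placeholders): writing with format='table' makes start/stop select only the requested rows instead of returning the whole frame:

```python
# Sketch: start/stop only apply to stores written in the "table" format.
import pandas as pd

df = pd.DataFrame({"x": range(100)})
df.to_hdf("results.h5", key="results", format="table")

with pd.HDFStore("results.h5") as hdf:
    subset = hdf.select("results", start=15, stop=20)
print(len(subset))   # 5 rows, not the whole dataframe
```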

Can I store a file (HDF5 file) in another file with serialization?

↘锁芯ラ Submitted on 2020-01-05 04:41:10
Question: I have an HDF5 file and a list of objects that I need to store for saving functionality. For simplicity I want to create only one save file. Can I store the H5 file in the save file that I create with serialization (pickle), without opening the H5 file? Answer 1: You can put several files in one by using zipfile or tarfile. For zipfile you would write() the database files and writestr() your pickle.dumps-ed data; for tarfile you would add the database file with gettarinfo() and addfile(), then addfile() your pickle.dump-ed data from a…
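A minimal zipfile sketch of that approach (the file and member names are placeholders); the HDF5 file is copied into the archive byte-for-byte without ever being opened as HDF5:

```python
# Sketch: bundle the HDF5 file and the pickled objects in one archive.
import pickle
import zipfile

objects = {"version": 1, "items": [1, 2, 3]}          # placeholder state

with zipfile.ZipFile("save.zip", "w") as zf:
    zf.write("data.h5")                               # raw copy of the file
    zf.writestr("state.pkl", pickle.dumps(objects))   # serialized objects
```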

How can I loop over HDF5 groups in Python removing rows according to a mask?

不羁岁月 Submitted on 2020-01-05 04:05:40
Question: I have an HDF5 file containing a number of different groups, all of which have the same number of rows. I also have a Boolean mask of rows to keep or remove. I would like to iterate over all groups in the HDF5 file, removing rows according to the mask. The recommended method to recursively visit all groups is visit(callable), but I can't work out how to pass my mask to the callable. Here is some code that hopefully demonstrates what I would like to do, but which doesn't work: def apply_mask(name,…
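One common pattern is to bind the extra state with functools.partial (or a closure) so the callable still matches the (name, node) signature that h5py's visititems() expects. A sketch under the assumption that every dataset's first dimension matches the mask length; it collects names during the visit and rewrites afterwards, since mutating the file mid-visit is unsafe:

```python
# Sketch: pass a mask into the visit callable via functools.partial.
import functools
import h5py
import numpy as np

def collect(names, name, node):
    # visititems() supplies only (name, node); "names" is bound outside
    if isinstance(node, h5py.Dataset):
        names.append(name)

mask = np.array([True, False, True])        # placeholder Boolean mask
with h5py.File("data.h5", "r+") as f:
    dataset_names = []
    f.visititems(functools.partial(collect, dataset_names))
    for name in dataset_names:              # rewrite after the visit finishes
        kept = f[name][()][mask]            # keep rows where mask is True
        del f[name]
        f.create_dataset(name, data=kept)
```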