hdf5

How can I extract data from an .h5 file and save it in .txt or .csv properly?

扶醉桌前 submitted on 2021-01-29 19:35:46
Question: After searching a lot, I couldn't find a simple way to extract data from an .h5 file and pass it to a DataFrame via NumPy or pandas in order to save it as a .txt or .csv file.

    import h5py
    import numpy as np
    import pandas as pd

    filename = 'D:\data.h5'
    f = h5py.File(filename, 'r')

    # List all groups
    print("Keys: %s" % f.keys())
    a_group_key = list(f.keys())[0]

    # Get the data
    data = list(f[a_group_key])
    pd.DataFrame(data).to_csv("hi.csv")

The printed keys are:

    Keys: <KeysViewHDF5 ['dd48']>

When I print data I see the following results:
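
A minimal sketch of one way to do this, assuming the file holds a single two-dimensional dataset (the file name and output names below are placeholders; the key is taken from whatever f.keys() reports, e.g. 'dd48' above):

    import h5py
    import pandas as pd

    with h5py.File('data.h5', 'r') as f:
        key = list(f.keys())[0]      # e.g. 'dd48'
        arr = f[key][...]            # read the whole dataset into a NumPy array

    # build a DataFrame from the 2-D array and write it out
    df = pd.DataFrame(arr)
    df.to_csv('data.csv', index=False)
    df.to_csv('data.txt', sep='\t', index=False)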

I want to convert very large CSV data to HDF5 in Python

◇◆丶佛笑我妖孽 submitted on 2021-01-29 17:37:12
Question: I have very large CSV data. It looks like this:

    [Date, Firm name, value 1, value 2, ..., value 60]

I want to convert this to an HDF5 file. For example, let's say I have two dates (2019-07-01, 2019-07-02), each date has 3 firms (firm 1, firm 2, firm 3), and each firm has [value 1, value 2, ..., value 60]. I want to use date and firm name as a group. Specifically, I want this hierarchy: 'Date/Firm name'. For example, 2019-07-01 has firm 1, firm 2, and firm 3. When you look at each firm, there
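
A minimal sketch of one possible approach, assuming the CSV has columns named Date, Firm, and the 60 value columns (file names, column names, and chunk size are placeholders, and this simple version assumes each Date/Firm pair appears within a single chunk):

    import h5py
    import pandas as pd

    with h5py.File('firms.h5', 'w') as hf:
        # read the CSV in chunks so the whole file never sits in RAM
        for chunk in pd.read_csv('big.csv', chunksize=100_000):
            for (date, firm), grp in chunk.groupby(['Date', 'Firm']):
                values = grp.drop(columns=['Date', 'Firm']).to_numpy()
                # one dataset per 'Date/Firm' path, e.g. '2019-07-01/firm 1'
                hf.create_dataset(f'{date}/{firm}', data=values)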

How do I get a data array from an HDF5 file in Python 3.6 if the dtype is "<u4"?

会有一股神秘感。 submitted on 2021-01-29 16:21:29
Question: I want to get a dataset with shape {N, 16, 512, 128} as a 4-D NumPy array from an HDF5 file. N is the number of 3-D arrays of shape {16, 512, 128}. I try to do this:

    import os
    import sys
    import h5py as h5
    import numpy as np
    import subprocess
    import re

    file_name = sys.argv[1]
    path = sys.argv[2]
    f = h5.File(file_name, 'r')
    data = f[path]
    print(data.shape)  # {27270, 16, 512, 128}
    print(data.dtype)  # "<u4"
    data = np.array(data, dtype=np.uint32)
    print(data.shape)

Unfortunately, after data = np.array(data,
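
For context, "<u4" is simply a little-endian unsigned 32-bit integer, which NumPy handles natively. A minimal sketch, assuming the whole dataset fits in memory (the file and dataset names are placeholders):

    import h5py

    with h5py.File('file.h5', 'r') as f:
        dset = f['path/to/dataset']
        arr = dset[...]                 # reads the whole {N, 16, 512, 128} block
        print(arr.shape, arr.dtype)     # dtype will be uint32 ("<u4")
        # if the full array is too large for RAM, read one 3-D slice at a time:
        first = dset[0]                 # shape (16, 512, 128)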

using H5T_ARRAY in Python

孤人 submitted on 2021-01-29 08:18:47
Question: I am trying to use H5T_ARRAY inside an H5T_COMPOUND structure using Python. Basically, I am writing an HDF5 file, and if you open it using h5dump, the structure looks like this:

    HDF5 "SO_64449277np.h5" {
    GROUP "/" {
       DATASET "Table3" {
          DATATYPE H5T_COMPOUND {
             H5T_COMPOUND {
                H5T_STD_I16LE "id";
                H5T_STD_I16LE "timestamp";
             } "header";
             H5T_COMPOUND {
                H5T_IEEE_F32LE "latency";
                H5T_STD_I16LE "segments_k";
                H5T_COMPOUND {
                   H5T_STD_I16LE "segment_id";
                   H5T_IEEE_F32LE "segment_quality";
                   H5T_IEEE_F32LE
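
One way to get an H5T_ARRAY inside an H5T_COMPOUND from Python is to give a field of a NumPy structured dtype a subarray shape; h5py maps such a field to an array type inside the compound. A minimal sketch (field names mirror the dump above; the file name, row count, and array length 5 are placeholders):

    import h5py
    import numpy as np

    segment = np.dtype([('segment_id', '<i2'), ('segment_quality', '<f4')])
    row = np.dtype([
        ('header', [('id', '<i2'), ('timestamp', '<i2')]),
        ('latency', '<f4'),
        ('segments_k', '<i2'),
        ('segments', segment, (5,)),   # subarray field -> H5T_ARRAY of a compound
    ])

    with h5py.File('example.h5', 'w') as f:
        f.create_dataset('Table3', shape=(10,), dtype=row)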

Python: Can I write to a file without loading its contents in RAM?

回眸只為那壹抹淺笑 submitted on 2021-01-29 08:03:56
Question: I've got a big data set that I want to shuffle. The entire set won't fit into RAM, so it would be good if I could open several files (e.g. HDF5, NumPy) simultaneously, loop through my data chronologically, and randomly assign each data point to one of the piles (then afterwards shuffle each pile). I'm really inexperienced with working with data in Python, so I'm not sure if it's possible to write to files without holding the rest of their contents in RAM (I've been using np.save and savez with little
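
A minimal sketch of one way to stream data points into several HDF5 "piles" without holding everything in RAM, using resizable datasets (the file names, pile count, record shape, and the placeholder generator are all assumptions):

    import h5py
    import numpy as np

    n_piles = 4
    piles = [h5py.File(f'pile_{i}.h5', 'w') for i in range(n_piles)]
    dsets = [f.create_dataset('data', shape=(0, 3), maxshape=(None, 3),
                              dtype='f8', chunks=True) for f in piles]

    def stream_points():
        # placeholder standing in for the chronological read of the source data
        for _ in range(1000):
            yield np.random.rand(3)

    for point in stream_points():
        i = np.random.randint(n_piles)       # pick a pile at random
        d = dsets[i]
        d.resize(d.shape[0] + 1, axis=0)     # grow the dataset by one row
        d[-1] = point                        # only this row is written to disk

    for f in piles:
        f.close()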

pytables add repetitive subclass as column

早过忘川 submitted on 2021-01-29 07:53:57
Question: I am creating an HDF5 file with strict parameters. It has one table consisting of variable columns. At one point the columns become repetitive, with different data being appended. Apparently, I can't add a loop inside an IsDescription class. Currently the class Segments has been added under class Summary_data twice. I need to call segments_k 70 times. What is the best approach to it? Thank you.

    class Header(IsDescription):
        _v_pos = 1
        id = Int16Col(dflt=1, pos = 0)
        timestamp = Int16Col(dflt=1, pos
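
One alternative often used with PyTables is to pass the table description as a (nested) dictionary instead of an IsDescription subclass, so repeated column groups can be generated in a loop. A minimal sketch (the count of 70 and the field names mirror the question; file and table names, types, and ordering are assumptions):

    import tables as tb

    # build 70 repeated nested 'segment' column groups programmatically
    segments = {f'segment_{k}': {'segment_id': tb.Int16Col(pos=0),
                                 'segment_quality': tb.Float32Col(pos=1)}
                for k in range(70)}

    description = {
        'header': {'id': tb.Int16Col(pos=0), 'timestamp': tb.Int16Col(pos=1)},
        'latency': tb.Float32Col(),
        'segments_k': tb.Int16Col(),
        **segments,
    }

    with tb.open_file('summary.h5', 'w') as h5:
        table = h5.create_table('/', 'summary_data', description=description)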

Adding new data into an HDF5 file results in an empty array

点点圈 submitted on 2021-01-28 19:00:59
Question: While playing with the HDF5 package for Python, I discovered a strange behavior. I want to insert more data into a table, but somehow I cannot get it to work properly. As you can see from the source code, I am getting the last row of data in key 'X' using fromRow = hf["X"].shape[0] and writing tempArray2 afterwards. The result is an empty table.

    import h5py

    tempArray1 = [[0.9293237924575806, -0.32789671421051025, 0.18110771477222443],
                  [0.9293237924575806, -0.32789671421051025, 0.18110771477222443],
                  [0
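
A minimal sketch of how appending to an existing dataset is usually done with h5py: create the dataset once with maxshape=(None, ...), then resize it before writing the new rows (the file name and the placeholder data are assumptions; 'X' follows the question):

    import h5py
    import numpy as np

    new_rows = np.random.rand(5, 3)        # placeholder for tempArray2

    with h5py.File('data.h5', 'a') as hf:
        if 'X' not in hf:
            # the dataset must be created resizable, otherwise it cannot grow later
            hf.create_dataset('X', data=new_rows, maxshape=(None, 3), chunks=True)
        else:
            dset = hf['X']
            fromRow = dset.shape[0]
            dset.resize(fromRow + new_rows.shape[0], axis=0)
            dset[fromRow:] = new_rows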

AttributeError: 'int' object has no attribute 'encode' HDF5

夙愿已清 submitted on 2021-01-28 05:41:06
Question: I'm trying to open an HDF5 file in Python using the following code:

    with h5py.File('example.hdf5', 'r') as f:
        ls = list(f.keys())
        dat = f.get('data')
        dt = np.array(dat)

However, I get this error when executing the last line:

    AttributeError: 'int' object has no attribute 'encode'

dat has the following class: h5py._hl.group.Group. Does anyone know where the error could come from? The output from iterating inside the file is the following. How can I access inside each part of the file: checking hdf5 file
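
The class reported for dat suggests that 'data' is a group rather than a dataset, so it cannot be converted to an array directly. A minimal sketch of how to list the datasets inside the group and read one (the inner dataset name is hypothetical):

    import h5py
    import numpy as np

    with h5py.File('example.hdf5', 'r') as f:
        grp = f['data']                  # this is a Group, not a Dataset
        # list everything stored under the group, recursively
        grp.visititems(lambda name, obj: print(name, type(obj)))
        # read an actual dataset inside the group (name is hypothetical)
        arr = np.array(grp['some_dataset'])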

Object dtype dtype('O') has no native HDF5 equivalent

99封情书 submitted on 2021-01-27 11:54:59
Question: Well, it seems like a couple of similar questions were asked here on Stack Overflow, but none of them seem to have been answered correctly or properly, nor do they describe exact examples. I have a problem with saving an array or list into HDF5. I have several files, each containing a list of shape (n, 35), where n may be different in each file. Each of them can be saved in HDF5 with the code below:

    hdf = hf.create_dataset(fname, data=d)

However, if I want to merge them to make it 3-D, the error occurs as
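
Arrays with differing n cannot be stacked into one rectangular 3-D block, which is typically what produces the dtype('O') error (NumPy falls back to an object array). A minimal sketch of one workaround, storing each (n, 35) array as its own dataset in a single file (the data and file names are placeholders):

    import h5py
    import numpy as np

    # placeholder ragged data: a different number of rows per array, 35 columns each
    arrays = {'file_a': np.random.rand(10, 35),
              'file_b': np.random.rand(17, 35)}

    with h5py.File('merged.h5', 'w') as hf:
        for fname, d in arrays.items():
            # one dataset per array sidesteps the ragged 3-D stack entirely
            hf.create_dataset(fname, data=d)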