How to write a Pandas DataFrame into an HDF5 dataset

Posted by 假如想象 on 2019-12-05 08:01:48

df.to_hdf() expects a string as its key parameter (the second parameter):

key : string

identifier for the group in the store

so try this:

df.to_hdf('database.h5', key=ds.name, format='table', mode='a')

where ds.name should return a string (the key name):

In [26]: ds.name
Out[26]: '/A1'
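As a minimal sketch of the call above, the snippet below writes a small frame under a plain string key and reads it back. The temporary file path and the sample data are assumptions made for illustration; the key '/A1' simply mirrors the ds.name value shown above. It needs PyTables installed, and format='table' is used as the current spelling of the older table=True flag.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Sample data, made up for this illustration.
df = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["x", "y"])

# Hypothetical file location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "database.h5")

# The key must be a plain string, e.g. the value returned by ds.name.
key = "/A1"

# Append mode keeps any other groups already stored in the file.
df.to_hdf(path, key=key, format="table", mode="a")

# Read the frame back from the same key.
restored = pd.read_hdf(path, key=key)
print(restored)
```

Passing the h5py dataset object itself instead of its name is what triggers the original error, since pandas only accepts the key as a string.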
AleVis

I decided to try pandas/PyTables and the HDFStore class instead of h5py, so I tried the following:

import numpy as np
import pandas as pd

db = pd.HDFStore('Database.h5')

index = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 3), index=index, columns=['Col1', 'Col2', 'Col3'])

groups = ['A', 'B', 'C']
subgroups = ['d', 'e', 'f']

# Store the same frame under every group/subgroup key, from A/d to C/f.
for m in groups:
    for n in subgroups:
        db.put(m + '/' + n, df, format='table', data_columns=True)

It works: 9 groups are created, from A/d to C/f (groups in PyTables rather than datasets in h5py?). Columns and indexes are preserved, and I can do the DataFrame operations I need. I am still wondering, though, whether this is an efficient way to retrieve data from a specific group that will become huge in the future, i.e. operations like

db['A/d'].Col1[4:]
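One point worth noting on the efficiency question: db['A/d'] reads the whole stored frame into memory before the column and row slice are applied. Because the frames were written with format='table', HDFStore.select can instead read only part of the table from disk. The sketch below is one possible approach, not a benchmark; the temporary file path is an assumption for illustration, and the column filter relies on data_columns=True having been used when writing.

```python
import os
import tempfile

import numpy as np
import pandas as pd

index = pd.date_range("1/1/2000", periods=8)
df = pd.DataFrame(np.random.randn(8, 3), index=index,
                  columns=["Col1", "Col2", "Col3"])

# Hypothetical file location used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "Database.h5")

with pd.HDFStore(path) as db:
    db.put("A/d", df, format="table", data_columns=True)

    # Rough equivalent of db['A/d'].Col1[4:], but only Col1 and
    # the rows from position 4 onward are read from disk.
    tail = db.select("A/d", columns=["Col1"], start=4)

    # Row filtering on a data column is evaluated by PyTables on disk.
    positive = db.select("A/d", where="Col1 > 0")

print(tail)
print(positive)
```

Note that select returns a DataFrame even when a single column is requested, so tail['Col1'] gives the Series.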