
Question:

I am using a pandas HDFStore to store some of my processed results prior to analysis. I want to put three types of results into the store:

Raw results that have not been processed at all, just read in and merged from their original CSV formats.
Processed results that are derived from the raw results, with some processing and division into more logical groupings.
Summarised results that have useful summary columns added and redundant columns removed, for easy reading.

I thought an HDFStore with hierarchical keys would do it: one group for Raw, one for Processed and one for Summarised.

I wanted a structure like:

    <class 'pandas.io.pytables.HDFStore'>
    File path: results.h5
    /proccessed/dbn_reinit                           frame        (shape->[22880,19])
    /proccessed/dbn_rerep_code                       frame        (shape->[11440,18])
    /proccessed/dbn_rerep_enhanced_input             frame        (shape->[11440,18])
    /proccessed/linear_classifier                    frame        (shape->[572,18])
    /proccessed/msda_rerep_code                      frame        (shape->[18304,17])
    /proccessed/msda_rerep_enhanced_input            frame        (shape->[18304,17])
    /raw/dbn_reinit                                  frame        (shape->[22880,15])
    /raw/dbn_rerep                                   frame        (shape->[23452,15])
    /raw/msda_rerep                                  frame        (shape->[36608,14])
    /summerised/dbn_reinit                           frame        (shape->[22880,10])
    /summerised/dbn_rerep_code                       frame        (shape->[11440,9])
    /summerised/dbn_rerep_enhanced_input             frame        (shape->[11440,9])
    /summerised/linear_classifier                    frame        (shape->[572,6])
    /summerised/msda_rerep_code                      frame        (shape->[18304,10])
    /summerised/msda_rerep_enhanced_input            frame        (shape->[18304,10])

I expected I could create this by saying:

    # Wished-for API: add_group() and indexing a subgroup do not exist in pandas
    store = pandas.HDFStore('results.h5')
    store.add_group('raw')
    raw_store = store['raw']

    raw_store['dbn_reinit'] = dbn_reinit_dataframe
    raw_store['dbn_rerep_code'] = dbn_rerep_code_dataframe
    ...

etc

However, there doesn't seem to be a way to get a subgroup of a store and use it as if it were a store itself,

so I had to do:

    store = pd.HDFStore('results.h5', mode='w')

    store['raw/dbn_reinit'] = dbn_reinit_dataframe
    store['raw/dbn_rerep'] = dbn_rerep_dataframe
    ...

which is wordy and doesn't really show any grouping of the results into the 3 categories. Am I missing something? Or are the hierarchical features of HDF just writing really long key names that have /s in them?

Answer 1:

The pandas HDFStore docs cover the use of hierarchical keys. .remove() has this type of functionality: you can remove a node at a given level together with everything further down the tree.
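As a minimal sketch of the point about .remove(), the snippet below writes three frames under hierarchical keys and then removes the whole raw level in one call. The frames and the temporary file path are hypothetical stand-ins, and HDFStore needs the PyTables package installed.

```python
import os
import tempfile

import numpy as np
import pandas as pd  # HDFStore additionally requires PyTables ("tables")

# Hypothetical stand-in frames; the real ones come from the question's CSV files.
raw_df = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["a", "b"])
summary_df = raw_df[["a"]]

# Temporary location so the sketch does not touch a real results.h5.
path = os.path.join(tempfile.mkdtemp(), "results.h5")

with pd.HDFStore(path, mode="w") as store:
    # The slashes in the key create nested groups in the file.
    store["raw/dbn_reinit"] = raw_df
    store["raw/dbn_rerep"] = raw_df
    store["summarised/dbn_reinit"] = summary_df
    keys_before = sorted(store.keys())

    # Removing the parent level also removes every node beneath it.
    store.remove("raw")
    keys_after = sorted(store.keys())

print(keys_before)
print(keys_after)
```

So the grouping is real at the file level, even though the store itself only addresses nodes by their full key.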

You can call store.get_storer('foo') to return an object that includes access to the node (e.g. via .group). However, this object won't let you add or select sub-nodes, nor does it provide a nice repr of that node.
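A short sketch of what get_storer() gives access to, assuming PyTables is installed. The frame and file path are hypothetical; _v_pathname and _v_parent are attributes of the underlying PyTables group, not pandas API.

```python
import os
import tempfile

import pandas as pd  # HDFStore additionally requires PyTables ("tables")

df = pd.DataFrame({"x": [1, 2, 3]})  # hypothetical stand-in frame
path = os.path.join(tempfile.mkdtemp(), "results.h5")

with pd.HDFStore(path, mode="w") as store:
    store["raw/dbn_reinit"] = df

    # The storer wraps one stored object; .group is its PyTables group node.
    storer = store.get_storer("raw/dbn_reinit")
    node_path = storer.group._v_pathname
    parent_path = storer.group._v_parent._v_pathname

print(node_path, parent_path)
```

The storer describes a single stored frame, so it is a way to inspect a node rather than a sub-store you can write into.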

You could put in a feature request for these features on GitHub. Please include a reproducible example of what you think this should do.

Pull-requests are welcome!

I rarely use multiple groups, mainly because using separate files is more flexible. You can do what you are trying to do (i.e. treat a group as if it were the file itself); I have just never found a need for it. HDF5 is not a database, so this is rarely useful.
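If you do want to treat a group as if it were the store, one option is a thin wrapper that prepends the group name to every key. GroupView below is a hypothetical helper, not part of pandas. A plain dict stands in for the store so the sketch runs without PyTables; an open pd.HDFStore should behave the same way here, since it supports the same item access and returns full keys such as /raw/dbn_reinit from keys().

```python
class GroupView:
    """Hypothetical helper: exposes one key prefix of a store as if it were a store."""

    def __init__(self, store, prefix):
        self._store = store
        self._prefix = "/" + prefix.strip("/")

    def _full(self, key):
        # Build the full hierarchical key, e.g. "/raw/dbn_reinit".
        return f"{self._prefix}/{key.strip('/')}"

    def __getitem__(self, key):
        return self._store[self._full(key)]

    def __setitem__(self, key, value):
        self._store[self._full(key)] = value

    def keys(self):
        # Keys below this prefix, reported relative to it.
        head = self._prefix + "/"
        return [k[len(head):] for k in self._store.keys() if k.startswith(head)]


# A dict stands in for the store; lists stand in for DataFrames.
backing = {}
raw = GroupView(backing, "raw")
summarised = GroupView(backing, "summarised")

raw["dbn_reinit"] = [1, 2, 3]
raw["dbn_rerep"] = [4, 5, 6]
summarised["dbn_reinit"] = [6]

print(sorted(backing.keys()))
print(sorted(raw.keys()))
```

This keeps the three categories visible in the calling code while the underlying file still uses the long slash-separated keys.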

Source: How do I read/write to a subgroup withing a HDF5Store?
