hdf5

Update a pandas DataFrame stored in a PyTable with another pandas DataFrame

▼魔方 西西 Submitted on 2019-12-05 13:10:45
I am trying to create a function that updates a pandas DataFrame stored in a PyTable with new data from another pandas DataFrame. I want to check whether some data is missing in the PyTable for specific DatetimeIndexes (a value is NaN or a new Timestamp is available), replace it with new values from a given pandas DataFrame, and append the result to the PyTable. Basically, just update a PyTable. I can get the combined DataFrame using the combine_first method in pandas. Below, the PyTable is created with dummy data:

    import pandas as pd
    import numpy as np
    import datetime as dt

    index = pd
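The update pattern described above can be sketched with combine_first. The dummy data, column name, and dates below are assumptions for illustration, not taken from the original post; writing back to the store is left as a comment since it needs PyTables installed:

```python
import numpy as np
import pandas as pd

# Dummy stored data: a daily index with one missing (NaN) value
index = pd.date_range('2019-01-01', periods=4, freq='D')
stored = pd.DataFrame({'v': [1.0, np.nan, 3.0, 4.0]}, index=index)

# New data: fills the NaN and adds a newer timestamp
new = pd.DataFrame({'v': [2.0, 5.0]},
                   index=pd.to_datetime(['2019-01-02', '2019-01-05']))

# combine_first keeps values from `new` and falls back to `stored`
# where `new` has no entry, giving the updated union of both indexes
updated = new.combine_first(stored)
print(updated['v'].tolist())  # [1.0, 2.0, 3.0, 4.0, 5.0]

# Writing back the simplest way is a full replace of the key
# (requires PyTables):
# with pd.HDFStore('store.h5') as store:
#     store.put('df', updated, format='table')
```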

How to partially copy an HDF5 file into a new one using Python, keeping the same structure?

我与影子孤独终老i Submitted on 2019-12-05 12:35:32
Question: I have a large hdf5 file that looks something like this:

    A/B/dataset1, dataset2
    A/C/dataset1, dataset2
    A/D/dataset1, dataset2
    A/E/dataset1, dataset2
    ...

I want to create a new file with only this:

    A/B/dataset1, dataset2
    A/C/dataset1, dataset2

What is the easiest way in Python? I did:

    fs = h5py.File('source.h5', 'r')
    fd = h5py.File('dest.h5', 'w')
    fs.copy('group B', fd)

The problem is that for dest.h5 I get:

    B/dataset1, dataset2

and I am missing part of the arborescence.

Answer 1:

    fs.copy('A/B
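The fix hinted at in the answer is to copy by full path into a pre-created parent group, so the copied object lands at /A/B instead of /B. A minimal sketch, where the file names and dummy datasets are assumptions:

```python
import h5py
import numpy as np

# Build a small source file mimicking the layout in the question
with h5py.File('source.h5', 'w') as fs:
    for g in ['B', 'C', 'D', 'E']:
        fs.create_dataset(f'A/{g}/dataset1', data=np.arange(3))
        fs.create_dataset(f'A/{g}/dataset2', data=np.ones(2))

# Copy only A/B and A/C, keeping the full hierarchy ("arborescence")
with h5py.File('source.h5', 'r') as fs, h5py.File('dest.h5', 'w') as fd:
    fd.create_group('A')             # recreate the parent group first
    for path in ['A/B', 'A/C']:
        fs.copy(path, fd['A'])       # lands at /A/B and /A/C

with h5py.File('dest.h5', 'r') as fd:
    print(sorted(fd['A'].keys()))    # ['B', 'C']
```

Group.copy also copies attributes and nested datasets, so each dataset1/dataset2 pair travels with its group.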

How can I load a data frame saved in pandas as an HDF5 file in R?

限于喜欢 Submitted on 2019-12-05 08:28:22
I saved a data frame in pandas in an HDF5 file:

    import numpy as np
    import pandas as pd

    np.random.seed(1)
    frame = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'),
                         index=['Utah', 'Ohio', 'Texas', 'Oregon'])
    print('frame: {0}'.format(frame))
    store = pd.HDFStore('file.h5')
    store['df'] = frame
    store.close()

The frame looks as follows:

    frame:         b         d         e
    Utah    1.624345 -0.611756 -0.528172
    Ohio   -1.072969  0.865408 -2.301539
    Texas   1.744812 -0.761207  0.319039
    Oregon -0.249370  1.462108 -2.060141

I am trying to load it in R:

    #source("http://bioconductor.org/biocLite.R")
    #biocLite("rhdf5")
    library(rhdf5)
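One thing that helps when reading a pandas fixed-format store from another language is knowing which internal HDF5 paths pandas actually wrote (the data ends up under sub-nodes of the key, not under 'df' directly). A quick way to list them from Python, shown here as an illustration (the h5py inspection step is not from the original post):

```python
import numpy as np
import pandas as pd
import h5py

np.random.seed(1)
frame = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'),
                     index=['Utah', 'Ohio', 'Texas', 'Oregon'])
with pd.HDFStore('file.h5') as store:   # fixed-format store, as in the question
    store['df'] = frame

# List every object name in the file; these are the paths an rhdf5
# reader (h5read) would need to address
names = []
with h5py.File('file.h5', 'r') as f:
    f.visit(names.append)
print(names)
```

For a fixed-format frame this shows nodes such as df/axis0 (columns), df/axis1 (index), and df/block0_values (the numeric data).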

How to write a pandas DataFrame into an HDF5 dataset

假如想象 Submitted on 2019-12-05 08:01:48
I'm trying to write data from a pandas DataFrame into a nested HDF5 file, with multiple groups and datasets within each group. I'd like to keep it as a single file which will grow on a daily basis. I've had a go with the following code, which shows the structure of what I'd like to achieve:

    import h5py
    import numpy as np
    import pandas as pd

    file = h5py.File('database.h5','w')
    d = {'one' : pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
         'two' : pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
    df = pd.DataFrame(d)
    groups = ['A','B','C']
    for m in groups:
        group = file.create
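One way to complete the loop above is to create one group per name and store each DataFrame column as its own dataset, since h5py stores plain arrays rather than DataFrames. A sketch under that assumption (the per-column layout is my choice, not from the original post):

```python
import h5py
import numpy as np
import pandas as pd

d = {'one': pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
     'two': pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)   # 'one' gets a NaN at index 'd' after alignment

with h5py.File('database.h5', 'w') as f:
    for m in ['A', 'B', 'C']:
        grp = f.create_group(m)
        # One dataset per column; the string index would need to be
        # stored separately if it matters
        for col in df.columns:
            grp.create_dataset(col, data=df[col].to_numpy())

with h5py.File('database.h5', 'r') as f:
    print(f['A']['one'][:])   # [ 1.  2.  3. nan]
```

For daily growth, reopening the file in mode 'a' and writing new groups (e.g. one per date) keeps everything in a single file.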

For python, install hdf5/netcdf4

主宰稳场 Submitted on 2019-12-05 05:57:39
Doing this on Linux Mint 17.1. When I try:

    pip install hdf5

I get the error:

    Could not find a version that satisfies the requirement hdf5 (from versions: )
    No matching distribution found for hdf5

In the long run I'm trying to install netcdf4, but I can't do that until I get hdf5 installed. Supposedly, from when I was trying this last week, I should be using pip install netcdf4, er, hdf5... at least maybe in the case of hdf5. If I try pip install h5py I get the message:

    Requirement already satisfied (use --upgrade to upgrade): h5py in ./anaconda3/lib/python3.5

hdf5 in maven project

为君一笑 Submitted on 2019-12-05 04:08:41
I'm trying to import hdf.hdf5lib.H5 into my Maven project in NetBeans. It has this import line:

    import hdf.hdf5lib.H5;

as suggested here: https://support.hdfgroup.org/products/java/JNI3/jhi5/index.html However, it throws this exception:

    java.lang.ExceptionInInitializerError
    Caused by: java.lang.RuntimeException: Uncompilable source code - package hdf.hdf5lib does not exist

NetBeans had already warned me about it by saying "package does not exist" at the import line. So I let it "search dependencies at Maven repositories". It does find something and adds this to my pom.xml:

    <dependency>

Deleting hdf5 dataset using h5py

心已入冬 Submitted on 2019-12-05 00:14:11
Is there any way to remove a dataset from an HDF5 file, preferably using h5py? Or alternatively, is it possible to overwrite a dataset while keeping the other datasets intact? To my understanding, h5py can read/write HDF5 files in 5 modes:

    f = h5py.File("filename.hdf5", 'mode')

where mode can be r for read, r+ for read-write, a for read-write (creating a new file if it doesn't exist), w for write/overwrite, and w-, which is the same as w but fails if the file already exists. I have tried all of them but none seem to work. Any suggestions are much appreciated.

EnemyBagJones: Yes, this can be done.

    with h5py.File
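The deletion the answer alludes to works with Python's del statement on an h5py group opened read-write. A minimal sketch with assumed file and dataset names:

```python
import h5py
import numpy as np

# Create a file with two datasets (example data)
with h5py.File('demo.h5', 'w') as f:
    f.create_dataset('keep', data=np.arange(5))
    f.create_dataset('drop', data=np.zeros(3))

# Open read-write ('a') and unlink one dataset; the rest stay intact
with h5py.File('demo.h5', 'a') as f:
    del f['drop']   # removes the link; disk space is not reclaimed
                    # until the file is repacked (e.g. with h5repack)

with h5py.File('demo.h5', 'r') as f:
    print(list(f.keys()))   # ['keep']
```

Overwriting in place is the same idea: del the old dataset, then create_dataset again under the same name (or write into the existing dataset's slice if the shape matches).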

Loading Matlab sparse matrix saved with -v7.3 (HDF5) into Python and operating on it

喜夏-厌秋 Submitted on 2019-12-04 23:41:41
Question: I'm new to Python, coming from MATLAB. I have a large sparse matrix saved in MATLAB v7.3 (HDF5) format. So far I've found two ways of loading the file, using h5py and tables. However, operating on the matrix seems extremely slow after either. For example, in MATLAB:

    >> whos
      Name        Size             Bytes     Class     Attributes
      M       11337x133338      77124408    double    sparse

    >> tic, sum(M(:)); toc
    Elapsed time is 0.086233 seconds.

Using tables:

    t = time.time()
    sum(f.root.M.data)
    elapsed = time.time() - t
    print
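The slowness above comes from summing the raw HDF5 arrays element by element in Python. MATLAB v7.3 stores a sparse matrix as CSC arrays (data, ir, jc), so rebuilding a scipy.sparse.csc_matrix from them makes operations fast again. Since the original M isn't available, this sketch writes a file in the same layout first; the file name and group layout here are assumptions modeled on MATLAB's convention:

```python
import h5py
import numpy as np
from scipy import sparse

# Fake a v7.3-style sparse variable: MATLAB stores CSC arrays data/ir/jc
m = sparse.random(50, 40, density=0.1, format='csc', random_state=0)
with h5py.File('mat.h5', 'w') as f:
    g = f.create_group('M')
    g['data'] = m.data      # nonzero values
    g['ir'] = m.indices     # row indices
    g['jc'] = m.indptr      # column pointers

# Rebuild a scipy CSC matrix; vectorized ops on it are then fast,
# comparable to MATLAB's native sparse sum
with h5py.File('mat.h5', 'r') as f:
    g = f['M']
    M = sparse.csc_matrix((g['data'][:], g['ir'][:], g['jc'][:]),
                          shape=(50, 40))

print(np.isclose(M.sum(), m.sum()))   # True
```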

g++ compile error: undefined reference to a shared library function which exists

柔情痞子 Submitted on 2019-12-04 23:26:24
I recently installed the hdf5 library on an Ubuntu machine, and am now having trouble linking to the exported functions. I wrote a simple test script readHDF.cpp to explain the issue:

    #include <hdf5.h>

    int main(int argc, char * argv[]) {
        hid_t h5_file_id = H5Fopen(argv[1], H5F_ACC_RDWR, H5P_DEFAULT);
        return 0;
    }

The compile command is:

    g++ -Wl,-rpath,$HOME/hdf5/lib -I$HOME/hdf5/include \
        -L$HOME/hdf5/lib -l:$HOME/hdf5/lib/libhdf5.so readHDF.cpp

which returns the following error:

    /tmp/cc6DXdxV.o: In function `main':
    readHDF.cpp:(.text+0x1f): undefined reference to `H5check_version'
    readHDF.cpp:(

Deleting information from an HDF5 file

▼魔方 西西 Submitted on 2019-12-04 22:58:40
I realize that an SO user asked this question before, but that was in 2009, and I was hoping that more knowledge of HDF5 was available now or that newer versions had fixed this particular issue. To restate the question concerning my own problem: I have a gigantic file of nodes and elements from a large geometry and have already retrieved all the useful information I need from it. Therefore, in Python, I am trying to keep the original file but delete the information I do not need and fill in more information from other sources. For example, I have a dataset of nodes that I don't need.
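Because HDF5 does not reclaim disk space when a dataset is deleted, the usual workaround for trimming a file like this is to copy only the objects worth keeping into a fresh file. A sketch with assumed file and dataset names:

```python
import h5py
import numpy as np

# Original file with data we need and data we don't (example layout)
with h5py.File('mesh.h5', 'w') as f:
    f.create_dataset('elements', data=np.arange(10))
    f.create_dataset('nodes', data=np.zeros(1000))   # no longer needed

# `del f['nodes']` would unlink it but leave the file size unchanged;
# copying the surviving objects into a new file actually frees the space
with h5py.File('mesh.h5', 'r') as src, \
     h5py.File('mesh_slim.h5', 'w') as dst:
    for name in src:
        if name != 'nodes':
            src.copy(name, dst)

with h5py.File('mesh_slim.h5', 'r') as f:
    print(list(f.keys()))   # ['elements']
```

New data from other sources can then be written into the slim file with create_dataset, since it is opened like any other HDF5 file. The h5repack command-line tool achieves the same compaction without custom code.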