Generating pcolormesh images from very large data sets saved in H5 files with Python

Submitted by 耗尽温柔 on 2019-12-01 06:03:44

Question


I am collecting a large amount of data that will be saved into individual H5 files using h5py. I would like to stitch the data from these files together into one pcolormesh plot and save it as a single image.

A quick example I have been working on generates arrays of 2000x2000 random data points and saves them in H5 files using h5py. I then load the data from these files and try to plot it in matplotlib as a pcolormesh, but I always run into a MemoryError (which is expected).

import numpy
import h5py
arr = numpy.random.random((2000,2000))

with h5py.File("TEST_HDF5_SAVE_FILES\\Plot_0.h5", "w") as f:
    dset = f.create_dataset("Plot_0", data = arr)

for i in range(1,100):
    arr = numpy.random.random((2000,2000))
    with h5py.File("TEST_HDF5_SAVE_FILES\\Plot_" + str(i) + ".h5", "w") as f:
        dset = f.create_dataset("Plot_" + str(i), data = arr)

This script generates my files. I picked 100 as an arbitrary number just to have a large enough set of files to pull from.

Then I import them using the following script:

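import numpy
import h5py
import matplotlib.pyplot as plt

y = numpy.arange(0, 2000, 1)

for display_plot_num in range(0, 5):
    print(display_plot_num)
    # Shift each tile 2000 columns to the right of the previous one
    x = numpy.arange(display_plot_num*2000, display_plot_num*2000 + 2000, 1)

    # Open read-only; the files are not modified here
    with h5py.File("TEST_HDF5_SAVE_FILES\\Plot_" + str(display_plot_num) + ".h5", "r") as f:
        data = f["Plot_" + str(display_plot_num)]
        plt.pcolormesh(x, y, data)
plt.show()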

The upper bound of the range in the for loop can be raised up to 100, but the largest value I can choose without a memory error is 5 (i.e. 5 plots can be patched together on one pcolormesh plot in matplotlib), and even then it is extremely clunky and slow. I need to be able to patch together many more images.

Is there another technique I should use to plot this data? Alternatively, it would be nice if I could convert the data from multiple H5 files directly into an image without going through matplotlib or a similar library (like scipy).

In summary, my problem is this:

  • I have a large number of HDF5 files with image data (2000x2000)
  • I need to patch together these files into a single image and save it

Any help is appreciated. Also, I would be glad to answer any further questions about my problem.


Edit (5.6.2013):

I feel a similar question would be how to deal with (import, manipulate, edit, etc.) very high resolution images in Python. This is essentially what I am trying to do: generate a very high resolution image from a collection of smaller images.
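For illustration only, and not part of the original thread, here is a minimal sketch of one way the stitching could be approached: read one tile at a time, downsample it with strided slicing while reading, and write the combined array straight to an image file instead of drawing a mesh. The helper name stitch_tiles, the step parameter, the small 40x40 demo tiles, and the temporary directory are all assumptions made for this example. Downsampling discards data, so this only fits cases where a reduced-resolution overview is acceptable.

```python
import os
import tempfile

import h5py
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; no window is opened
import matplotlib.image as mpimg


def stitch_tiles(paths, names, step=4):
    """Hypothetical helper: read one tile at a time, keep every
    `step`-th sample, and place the tiles side by side."""
    columns = []
    for path, name in zip(paths, names):
        with h5py.File(path, "r") as f:
            # Strided slicing on the dataset means only the
            # downsampled values are copied into memory.
            tile = f[name][::step, ::step]
        columns.append(tile.astype(np.float32))
    return np.hstack(columns)


# Demo with small stand-in tiles (40x40 instead of 2000x2000).
workdir = tempfile.mkdtemp()
paths, names = [], []
for i in range(3):
    name = "Plot_" + str(i)
    path = os.path.join(workdir, name + ".h5")
    with h5py.File(path, "w") as f:
        f.create_dataset(name, data=np.random.random((40, 40)))
    paths.append(path)
    names.append(name)

mosaic = stitch_tiles(paths, names, step=4)

# imsave maps the array through a colormap and writes the pixels
# directly, without building a figure or a mesh.
out_path = os.path.join(workdir, "mosaic.png")
mpimg.imsave(out_path, mosaic, cmap="viridis", origin="lower")
```

With real 2000x2000 tiles, step would need to be chosen so that the combined array still fits comfortably in memory.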


Answer 1:


One way to reduce the bloat of images in matplotlib (especially when saving to SVG) is to use the rasterized=True kwarg. This essentially "flattens" your pcolormesh into a bitmap, which makes it much faster to save and uses fewer resources.
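A minimal sketch of that suggestion, assuming a small 200x200 random array as a stand-in for one tile and a temporary output path chosen only for this example:

```python
import os
import tempfile

import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; no window is opened
import matplotlib.pyplot as plt

# Small stand-in for one 2000x2000 tile (assumption for the demo).
data = np.random.random((200, 200))

fig, ax = plt.subplots()
# rasterized=True asks vector backends (SVG, PDF) to embed the mesh
# as a bitmap instead of one vector shape per cell.
mesh = ax.pcolormesh(data, rasterized=True)

svg_path = os.path.join(tempfile.mkdtemp(), "mesh_rasterized.svg")
# For vector output, dpi sets the resolution of the rasterized parts.
fig.savefig(svg_path, dpi=150)
plt.close(fig)
```

The axes, ticks, and labels stay as vector elements; only the mesh itself is flattened. This reduces the cost of saving, but it does not by itself reduce the memory needed to hold the data being plotted.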



Source: https://stackoverflow.com/questions/16921997/generating-pcolormesh-images-from-very-large-data-sets-saved-in-h5-files-with-py
