SciPy ndimage morphology operators saturate my computer's RAM (8GB)


I too do openings of increasing radius for granulometry, and I ran into this same problem. In fact, the memory usage increases roughly as R^6, where R is the radius of the spherical kernel - quite a rate of increase! I did some memory profiling, including splitting the opening into an erosion followed by a dilation (the definition of opening), and found that the large memory usage comes from within SciPy's compiled code and is cleared as soon as the result is returned to the calling Python script. SciPy's morphology routines are mostly implemented in C, so modifying them is a difficult prospect.
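
(For reference, a minimal sketch of that split - the toy inputs A and B here are my own stand-ins, just to make the two stages explicit for a memory profiler:)

import numpy as np
from scipy import ndimage

# Toy inputs for illustration only: a random boolean volume and a cubic kernel
A = np.random.rand(50, 50, 50) > 0.5
B = np.ones((3, 3, 3), dtype=bool)

# An opening is an erosion followed by a dilation with the same kernel;
# running the stages separately lets a profiler attribute memory to each call
eroded = ndimage.binary_erosion(A, structure=B)
opened = ndimage.binary_dilation(eroded, structure=B)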

Anyway, the OP's last comment - "After some research I turned to an Opening implementation using convolution -> multiplication of Fourier transforms - O(n log n), and not so big memory overhead" - helped me figure out the solution, so thanks for that. The implementation, however, was not obvious at first. For anyone else who happens upon this problem, I am going to post the implementation here.

I will start with dilation, because binary erosion is just the dilation of the complement (inverse) of a binary image, with the result inverted again.

In short: according to this white paper by Kosheleva et al., dilation can be viewed as a convolution of the dataset A with the structuring element (spherical kernel) B, thresholded above a certain value. Convolutions can also be done (often much faster) in frequency space, since multiplication in frequency space is equivalent to convolution in real space. So by taking the Fourier transforms of A and B, multiplying them, inverse-transforming the result, and then thresholding for values above 0.5, you get the dilation of A with B. (Note that the white paper I linked says to threshold above 0, but much testing showed that that gave wrong results with many artifacts; a second white paper, by Kukal et al., gives the threshold value as >0.5, and that gave results identical to scipy.ndimage.binary_dilation for me. I'm not sure why there is a discrepancy, and I wonder if I missed some detail of ref 1's nomenclature.)

A proper implementation involves padding for size, but luckily for us, it's already been done in scipy.signal.fftconvolve(A, B, 'same') - this function does what I just described and takes care of the padding for you. Passing 'same' as the third argument returns a result the same size as A, which is what we want (otherwise it would be padded out by the size of B).

So dilation is:

from scipy.signal import fftconvolve
def dilate(A, B):
    # Per Kukal et al., FFT convolution thresholded at 0.5 gives dilation
    return fftconvolve(A, B, 'same') > 0.5
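
The answers never show how to construct the spherical kernel B itself, so here is a minimal sketch of one way to build it and feed it to dilate() - the ball() helper is my own assumption, not part of the original answer:

import numpy as np
def ball(R):
    # Hypothetical helper (not from the original post): a boolean sphere of
    # radius R centered in a cube of side 2R+1
    z, y, x = np.ogrid[-R:R + 1, -R:R + 1, -R:R + 1]
    return (x**2 + y**2 + z**2) <= R**2

# Example usage with the dilate() above
A = np.zeros((100, 100, 100), dtype=bool)
A[40:60, 40:60, 40:60] = True
D = dilate(A, ball(5))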

Erosion in principle is this: you invert A, dilate it by B as above, and then re-invert the result. But it requires a slight trick to match the results of scipy.ndimage.binary_erosion exactly - you must pad the inversion with 1s out to at least the radius R of the spherical kernel B. So erosion can be implemented as follows to get results identical to scipy.ndimage.binary_erosion. (Note that the code could be written in fewer lines, but I'm trying to be illustrative here.)

from scipy.signal import fftconvolve
import numpy as np
def erode_v1(A, B, R):
    # R should be the radius of the spherical kernel, i.e. half the width of B.
    # Invert A and pad the inversion with 1s so erosion also occurs from the
    # dataset edges, matching scipy.ndimage.binary_erosion
    A_inv = np.logical_not(A)
    A_inv = np.pad(A_inv, R, 'constant', constant_values=1)
    tmp = fftconvolve(A_inv, B, 'same') > 0.5
    # now we must un-pad the result, and invert it again (assumes A is 3D)
    return np.logical_not(tmp[R:-R, R:-R, R:-R])

You can get identical erosion results another way, as shown in the white paper by Kukal et al. - they point out that the convolution of A and B can be turned into an erosion by thresholding at > m - 0.5, where m is the "size" of B (which turns out to be the volume of the sphere, not the volume of the array). I showed erode_v1 first because it's slightly easier to understand, but the results are the same here:

from scipy.signal import fftconvolve
import numpy as np
def erode_v2(A, B):
    # m = np.count_nonzero(B) is the volume of the sphere; thresholding the
    # convolution at m - 0.5 gives the erosion directly, with no padding step
    thresh = np.count_nonzero(B) - 0.5
    return fftconvolve(A, B, 'same') > thresh
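
Since the original question was about openings: composing the two functions gives an FFT-based opening. This composition is just the definition of opening restated in code - a sketch of mine, not something from the original answer:

def fft_opening(A, B):
    # Opening = erosion followed by dilation with the same kernel,
    # using the dilate() and erode_v2() defined above
    return dilate(erode_v2(A, B), B)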

I hope this helps anyone else having this problem. Notes about the results I got:

  • I tested this in both 2D and 3D, and all results were identical to those from the scipy.ndimage morphological operations (as well as the skimage operations, which on the back end just call the ndimage ones).
  • For my largest kernels (R=21), the memory usage was 30x less! The speed was also 20x faster (a rough timing sketch follows this list).
  • I only tested it on binary images, though - I just don't know about greyscale, but there is some discussion of that in the second reference below.
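
For what it's worth, here is a minimal timing sketch along the lines of that second bullet - the array size, the ball() helper from earlier, and the np.array_equal check are my own assumptions, not the author's actual benchmark:

import time
import numpy as np
from scipy import ndimage

A = np.random.rand(200, 200, 200) > 0.5
B = ball(10)  # the hypothetical ball() helper sketched earlier

t0 = time.perf_counter()
ref = ndimage.binary_dilation(A, structure=B)
t1 = time.perf_counter()
fft = dilate(A, B)
t2 = time.perf_counter()

# The results should be identical; the timing gap grows with kernel size
print("ndimage: %.2f s, fft: %.2f s, identical: %s"
      % (t1 - t0, t2 - t1, np.array_equal(ref, fft)))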

Two more quick notes:

First: Consider the padding I discussed above for erode_v1. Padding the inverse out with 1s allows erosion to occur from the edges of the dataset as well as from any interface within it. Depending on your system and what you are trying to do, you may want to consider whether this truly represents the way you want the edges handled. If not, you might consider padding with the 'reflect' boundary condition instead, which simulates a continuation of any features near the edge (a sketch follows below). I recommend playing around with different boundary conditions (on both dilation and erosion) and visualizing and quantifying the results to determine what suits your system and goals best.
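
As a starting point, a minimal sketch of that 'reflect' variant, assuming the same 3D layout as erode_v1 - by design it will not match scipy.ndimage.binary_erosion at the borders:

from scipy.signal import fftconvolve
import numpy as np
def erode_reflect(A, B, R):
    # Like erode_v1, but the padding mirrors the data ('reflect') instead of
    # filling with 1s, so features near the edge continue past the border
    # rather than being eroded from the dataset edge. Assumes A is 3D
    A_inv = np.logical_not(A)
    A_inv = np.pad(A_inv, R, mode='reflect')
    tmp = fftconvolve(A_inv, B, 'same') > 0.5
    return np.logical_not(tmp[R:-R, R:-R, R:-R])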

Second: This frequency-based method is better not only in memory but also in speed - for the most part. For small kernels B, the original method is faster. However, small kernels run very quickly anyway, so for my own purposes I don't care. If you do (say, if you are applying a small kernel many times), you may want to find the critical size of B and switch methods at that point, as sketched below.
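
A minimal sketch of that idea - the crossover radius R_crit here is entirely made up and should be measured on your own data and hardware:

from scipy import ndimage
def hybrid_erode(A, B, R, R_crit=8):
    # R_crit is a hypothetical crossover radius, not a measured value;
    # time both branches on your own data to pick it
    if R < R_crit:
        return ndimage.binary_erosion(A, structure=B)
    return erode_v2(A, B)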

References, though I apologize that they are not easy to cite, as neither provides a year:

  1. Fast Implementation of Morphological Operations Using Fast Fourier Transform by O. Kosheleva, S. D. Cabrera, G. A. Gibson, M. Koshelev. http://www.cs.utep.edu/vladik/misha5.pdf
  2. Dilation and Erosion of Gray Images with Spherical Masks by J. Kukal, D. Majerova, A. Prochazka. http://www2.humusoft.cz/www/papers/tcp07/001_kukal.pdf

A wild guess would be that the code is trying to decompose the structuring element somehow and doing several parallel computations, each with its own copy of the whole original data. 400x400x400 is not that big, tbh...

AFAIK, since you are doing a single opening/closing, it should use, at most, 3x the memory of the original data: original + dilation/erosion + final result...

You could try to implement it yourself by hand... it might be slower, but the code is easy enough and should give some insight into the problem (a rough sketch below)...
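
In that spirit, a minimal by-hand erosion sketch for boolean 3D arrays - my own illustration of the suggestion above, not the answerer's code, and far slower than either library route:

import numpy as np
def manual_erosion(A, B):
    # A voxel survives only if A is True at every offset where B is True.
    # Peak memory is one padded copy of A plus one boolean accumulator,
    # roughly 2-3x the input, consistent with the estimate above.
    # The border is treated as background (0s), matching the
    # scipy.ndimage.binary_erosion default
    R = np.array(B.shape) // 2
    Ap = np.pad(A, [(r, r) for r in R], mode='constant', constant_values=0)
    out = np.ones_like(A, dtype=bool)
    for dz, dy, dx in np.argwhere(B):
        out &= Ap[dz:dz + A.shape[0], dy:dy + A.shape[1], dx:dx + A.shape[2]]
    return out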
