Fastest 2D convolution or image filter in Python

遥遥无期 2020-12-24 03:35

Several users have asked about the speed or memory consumption of image convolutions in numpy or scipy [1, 2, 3, 4]. From the responses and my experience using Numpy, I believe this may be a major shortcoming of numpy compared to matlab or idl.

5 Answers
  • 2020-12-24 04:04

    I have been trying to improve the convolution speed in my application, and I found that signal.correlate is about 20 times slower than signal.correlate2d when my input matrices are small (27x27 and 5x5). As of 2018, this is what I observed on my machine (Dell Inspiron 13, Core i5) for the matrices specified in the original question.
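
    For reference, a minimal harness to reproduce that small-matrix comparison on your own machine (sizes taken from the paragraph above; not part of the timings below):

    import timeit

    setup = """
    import numpy as np
    from scipy import signal
    a = np.random.rand(27, 27)
    b = np.random.rand(5, 5)
    """

    for stmt in ("signal.correlate(a, b, mode='valid')",
                 "signal.correlate2d(a, b, mode='valid')"):
        print(stmt, timeit.timeit(stmt, setup=setup, number=1000))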

    OpenCV did the best, but the caveat is that it doesn't offer "mode" options: the input and output are always the same size (see the sketch after the timings for one way to emulate 'valid' mode).

    >>> import time
    >>> import numpy as np
    >>> import cv2
    >>> from scipy import signal
    >>> img = np.random.rand(2048, 2048)
    >>> kernel = np.ones((32, 32), dtype=np.float64)
    >>> t1 = time.time(); dst1 = cv2.filter2D(img, -1, kernel); print(time.time() - t1)
    0.208490133286
    >>> t1 = time.time(); dst2 = signal.correlate(img, kernel, mode='valid', method='fft'); print(time.time() - t1)
    0.582989931107
    >>> t1 = time.time(); dst3 = signal.convolve2d(img, kernel, mode='valid'); print(time.time() - t1)
    11.2672450542
    >>> t1 = time.time(); dst4 = signal.correlate2d(img, kernel, mode='valid'); print(time.time() - t1)
    11.2443971634
    >>> t1 = time.time(); dst5 = signal.fftconvolve(img, kernel, mode='valid'); print(time.time() - t1)
    0.581533193588
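
    If you do need something comparable to mode='valid' from OpenCV, one workaround (a sketch, not benchmarked above) is to pin the kernel anchor to the top-left corner and crop the region where the kernel fits entirely inside the image. Note that cv2.filter2D computes a correlation rather than a convolution; flip the kernel if you need a true convolution (for the symmetric np.ones kernel above it makes no difference).

    >>> dst1_full = cv2.filter2D(img, -1, kernel, anchor=(0, 0))
    >>> # Keep only the positions where the whole kernel overlaps the image; this
    >>> # should match signal.correlate2d(img, kernel, mode='valid') up to
    >>> # floating-point differences.
    >>> dst1_valid = dst1_full[:img.shape[0] - kernel.shape[0] + 1,
    ...                        :img.shape[1] - kernel.shape[1] + 1]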
    
  • 2020-12-24 04:07

    On my machine, a hand-crafted circular convolution using FFTs seems to be the fastest:

    import numpy
    x = numpy.random.random((2048, 2048)).astype(numpy.float32)
    y = numpy.random.random((32, 32)).astype(numpy.float32)
    z = numpy.fft.irfft2(numpy.fft.rfft2(x) * numpy.fft.rfft2(y, x.shape))
    

    Note that this might treat the areas close to the edges differently from the other approaches, because it's a circular convolution.
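
    If you need the edges handled like a linear convolution instead, a minimal sketch is to zero-pad both inputs to the full output size before the FFTs, so the wrap-around lands entirely in the padding, and then crop; this is roughly what scipy.signal.fftconvolve does internally (it also rounds the padded size up to a fast FFT length):

    import numpy
    x = numpy.random.random((2048, 2048)).astype(numpy.float32)
    y = numpy.random.random((32, 32)).astype(numpy.float32)

    # Pad to the 'full' linear-convolution shape so the circular wrap-around
    # falls into the zero padding instead of corrupting real output values.
    full_shape = (x.shape[0] + y.shape[0] - 1, x.shape[1] + y.shape[1] - 1)
    z_full = numpy.fft.irfft2(numpy.fft.rfft2(x, full_shape) *
                              numpy.fft.rfft2(y, full_shape), full_shape)

    # Crop to the 'valid' region (positions where y fits entirely inside x).
    z_valid = z_full[y.shape[0] - 1:x.shape[0], y.shape[1] - 1:x.shape[1]]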

  • 2020-12-24 04:14

    SciPy has a function, fftconvolve, which can be used for 1D and 2D signals.

    from scipy import signal
    from scipy import misc
    import numpy as np
    import matplotlib.pyplot as plt
    
    face = misc.face(gray=True)  # note: scipy.misc.face moved to scipy.datasets.face in SciPy >= 1.10
    kernel = np.outer(signal.windows.gaussian(70, 8), signal.windows.gaussian(70, 8))
    blurred = signal.fftconvolve(face, kernel, mode='same')
    
    fig, (ax_orig, ax_kernel, ax_blurred) = plt.subplots(3, 1, figsize=(6, 15))
    ax_orig.imshow(face, cmap='gray')
    ax_orig.set_title('Original')
    ax_orig.set_axis_off()
    ax_kernel.imshow(kernel, cmap='gray')
    ax_kernel.set_title('Gaussian kernel')
    ax_kernel.set_axis_off()
    ax_blurred.imshow(blurred, cmap='gray')
    ax_blurred.set_title('Blurred')
    ax_blurred.set_axis_off()
    plt.show()
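
    The same function handles 1D signals as well; a minimal sketch smoothing a noisy sine with a moving-average window:

    import numpy as np
    from scipy import signal

    t = np.linspace(0, 1, 2000)
    noisy = np.sin(2 * np.pi * 5 * t) + 0.3 * np.random.randn(t.size)

    # Normalize the window so the filter has unit gain and preserves the signal level.
    win = np.ones(50) / 50.0
    smoothed = signal.fftconvolve(noisy, win, mode='same')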
    
  • 2020-12-24 04:16

    I did some experiments with this too. My guess is that the SciPy convolution does not use the BLAS library to accelerate the computation. Using BLAS, I was able to code a 2D convolution that was comparable in speed to MATLAB's. It's more work, but your best bet is to recode the convolution in C++.

    Here is the tight part of the loop (please forgive the weird ()-based array referencing; it is my convenience class for MATLAB arrays). The key point is that you don't iterate over the image: you iterate over the filter and let BLAS iterate over the image, because typically the image is much larger than the filter.

    // Iterate over the (small) filter; for each filter tap, let BLAS sweep the
    // (large) image. diffM and diffN are the 'valid' output dimensions
    // (presumably image.numRows - filt.numRows + 1 and image.numCols - filt.numCols + 1).
    for (int n = 0; n < filt.numCols; n++)
    {
      for (int m = 0; m < filt.numRows; m++)
      {
        // Flipped indexing makes this a convolution rather than a correlation.
        const double filt_val = filt(filt.numRows-1-m, filt.numCols-1-n);
        for (int i = 0; i < diffN; i++)
        {
          double *out_ptr = &outImage(0, i);
          const double *im_ptr = &image(m, i+n);
          // outImage(0:diffM-1, i) += filt_val * image(m:m+diffM-1, i+n)
          // (column-wise axpy, assuming MATLAB-style column-major storage).
          cblas_daxpy(diffM, filt_val, im_ptr, 1, out_ptr, 1);
        }
      }
    }
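
    The same strategy can be sketched in plain NumPy (a rough analogue of the loop above, not a line-for-line port): iterate over the small filter and let vectorized array adds sweep the large image, producing a 'valid'-mode convolution.

    import numpy as np

    def convolve2d_valid(image, filt):
        """'valid'-mode 2D convolution: iterate over the filter taps and
        vectorize over the image, the same access pattern as the BLAS loop."""
        out_rows = image.shape[0] - filt.shape[0] + 1
        out_cols = image.shape[1] - filt.shape[1] + 1
        out = np.zeros((out_rows, out_cols), dtype=np.result_type(image, filt))
        for m in range(filt.shape[0]):
            for n in range(filt.shape[1]):
                # Flipped filter indexing makes this a convolution, not a correlation.
                out += (filt[filt.shape[0] - 1 - m, filt.shape[1] - 1 - n]
                        * image[m:m + out_rows, n:n + out_cols])
        return out

    For a 2048x2048 image and a 32x32 kernel that is 1024 vectorized passes over the image instead of ~4 million per-pixel dot products, which is the point of iterating over the filter rather than the image.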
    
  • 2020-12-24 04:21

    It really depends on what you want to do... A lot of the time, you don't need a fully generic (read: slower) 2D convolution. If the filter is separable, you can use two 1D convolutions instead; this is why scipy.ndimage.gaussian_filter, scipy.ndimage.uniform_filter, etc. are much faster than the same thing implemented as a generic n-D convolution.
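
    As a concrete illustration of the separable case, a minimal sketch using a Gaussian kernel (chosen because the 2D kernel factors exactly into an outer product of 1D kernels):

    import numpy as np
    from scipy import ndimage, signal

    x = np.random.random((2048, 2048)).astype(np.float32)

    # A separable 32x32 kernel: the outer product of two 1D Gaussian windows.
    g = signal.windows.gaussian(32, 4)
    kernel_2d = np.outer(g, g)

    # Generic 2D convolution...
    full_2d = ndimage.convolve(x, kernel_2d)

    # ...versus one 1D pass per axis, which should give the same result (up to
    # floating-point error) with ~32+32 multiplies per pixel instead of 32*32.
    sep = ndimage.convolve1d(ndimage.convolve1d(x, g, axis=0), g, axis=1)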

    At any rate, as a point of comparison:

    t = timeit.timeit(stmt='ndimage.convolve(x, y, output=x)', number=1,
    setup="""
    import numpy as np
    from scipy import ndimage
    x = np.random.random((2048, 2048)).astype(np.float32)
    y = np.random.random((32, 32)).astype(np.float32)
    """)
    print(t)
    

    This takes 6.9 sec on my machine...

    Compare this with fftconvolve:

    t = timeit.timeit(stmt="signal.fftconvolve(x, y, mode='same')", number=1,
    setup="""
    import numpy as np
    from scipy import signal
    x = np.random.random((2048, 2048)).astype(np.float32)
    y = np.random.random((32, 32)).astype(np.float32)
    """)
    print(t)
    

    This takes about 10.8 secs. However, with different input sizes, using FFTs to do a convolution can be considerably faster (though I can't seem to come up with a good example at the moment...).
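
    If you want to explore that crossover yourself, here is a small harness along those lines (a sketch; the break-even point depends on your hardware and on the image/kernel sizes):

    import timeit

    setup = """
    import numpy as np
    from scipy import ndimage, signal
    x = np.random.random((2048, 2048)).astype(np.float32)
    y = np.random.random(({k}, {k})).astype(np.float32)
    """

    for k in (8, 32, 64):
        direct = timeit.timeit("ndimage.convolve(x, y)",
                               setup=setup.format(k=k), number=1)
        fft = timeit.timeit("signal.fftconvolve(x, y, mode='same')",
                            setup=setup.format(k=k), number=1)
        print("kernel %dx%d: direct %.2f s, fft %.2f s" % (k, k, direct, fft))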
