Python convolution with different dimensions

旧巷少年郎 2020-12-21 00:33

I'm trying to implement a convolutional neural network in Python.
However, neither signal.convolve nor np.convolve can convolve X with Y when X is 3-D and Y is 2-D.

1 Answer
  • 2020-12-21 01:21

    SciPy implements standard N-dimensional convolution, so the array to be convolved and the kernel must have the same number of dimensions.

    A quick fix is to add an extra dimension to Y so that it is 3-dimensional:

    result = signal.convolve(X, Y[..., None], 'valid')
    

    I'm assuming here that the last axis is the image index, as in your example: [width, height, image_idx] (or [height, width, image_idx]). If it is the other way around and the images are indexed along the first axis (which is more common for C-ordered arrays), replace Y[..., None] with Y[None, ...].

    The expression Y[..., None] adds an extra axis to Y, giving it the 3-dimensional shape [kernel_width, kernel_height, 1] and thus making it a valid 3-dimensional convolution kernel.

    NOTE: This assumes that all images in a mini-batch have the same width x height, which is standard in CNNs.
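
    As a minimal sketch of the point above, the following checks that convolving with the kernel expanded by a singleton axis gives the same values as convolving each image separately. The array sizes and the names reference, X_first and result_first are made up for illustration; signal.convolve2d is used only as an independent per-image reference.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 8, 3))  # [height, width, image_idx]; sizes are arbitrary
Y = rng.standard_normal((3, 3))     # 2-D kernel

# A trailing singleton axis turns Y into a (3, 3, 1) kernel.
result = signal.convolve(X, Y[..., None], mode="valid")

# Reference: convolve every image on its own with the 2-D kernel.
reference = np.stack(
    [signal.convolve2d(X[..., i], Y, mode="valid") for i in range(X.shape[-1])],
    axis=-1,
)
print(result.shape, np.allclose(result, reference))

# Images-first layout [image_idx, height, width]: the new axis goes in front.
X_first = np.moveaxis(X, -1, 0)
result_first = signal.convolve(X_first, Y[None, ...], mode="valid")
print(result_first.shape)
```

    Because the kernel has length 1 along the image axis, no mixing between images takes place; each output slice depends only on the matching input image.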


    EDIT: Some timings as @Divakar suggested.

    The testing framework is set up as follows:

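    import numpy as np
    from scipy import ndimage, signal

    def test(S, N, K):
        """ S: image size, N: num images, K: kernel size (odd)"""
        a = np.random.randn(S, S, N)
        b = np.random.randn(K, K)
        # Crop to the 'valid' region. This must be a tuple: current NumPy
        # no longer accepts a list of slices as a multidimensional index.
        valid = (slice(K//2, -K//2+1), slice(K//2, -K//2+1))

        # %timeit is an IPython magic, so run this in IPython or Jupyter.
        %timeit signal.convolve(a, b[..., None], 'valid')
        %timeit signal.fftconvolve(a, b[..., None], 'valid')
        %timeit ndimage.convolve(a, b[..., None])[valid]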
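
    The comparison is only fair if the three timed calls compute the same thing. A small sketch of that check follows, assuming an odd kernel size and small, arbitrarily chosen sizes. Passing method="direct" is my own choice here, so that the first call does not silently fall back to the FFT path in recent SciPy versions.

```python
import numpy as np
from scipy import ndimage, signal

S, N, K = 32, 4, 5  # small sizes picked only to keep the check quick; K must be odd
rng = np.random.default_rng(0)
a = rng.standard_normal((S, S, N))
kernel = rng.standard_normal((K, K))[..., None]

direct = signal.convolve(a, kernel, mode="valid", method="direct")
fft = signal.fftconvolve(a, kernel, mode="valid")

# ndimage.convolve returns an array of the input's size, so the border
# (K // 2 pixels on each side) is cropped to match the 'valid' output.
valid = (slice(K // 2, -(K // 2)), slice(K // 2, -(K // 2)))
cropped = ndimage.convolve(a, kernel)[valid]

print(direct.shape, fft.shape, cropped.shape)
print(np.allclose(direct, fft), np.allclose(direct, cropped))
```

    Note that ndimage.convolve also computes the border pixels (using its default 'reflect' boundary mode) before they are cropped away, so it does slightly more work than the 'valid' calls.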
    

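    Below are the results for different configurations. Each run prints three timings, in the order signal.convolve, signal.fftconvolve, ndimage.convolve: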

    • Varying image size S:

      >>> test(100, 50, 11) # 100x100 images
      1 loop, best of 3: 909 ms per loop
      10 loops, best of 3: 116 ms per loop
      10 loops, best of 3: 54.9 ms per loop
      
      >>> test(1000, 50, 11) # 1000x1000 images
      1 loop, best of 3: 1min 51s per loop
      1 loop, best of 3: 16.5 s per loop
      1 loop, best of 3: 5.66 s per loop
      
    • Varying number of images N:

      >>> test(100, 5, 11) # 5 images
      10 loops, best of 3: 90.7 ms per loop
      10 loops, best of 3: 26.7 ms per loop
      100 loops, best of 3: 5.7 ms per loop
      
      >>> test(100, 500, 11) # 500 images
      1 loop, best of 3: 9.75 s per loop
      1 loop, best of 3: 888 ms per loop
      1 loop, best of 3: 727 ms per loop
      
    • Varying kernel size K:

      >>> test(100, 50, 5) # 5x5 kernels
      1 loop, best of 3: 217 ms per loop
      10 loops, best of 3: 100 ms per loop
      100 loops, best of 3: 11.4 ms per loop
      
      >>> test(100, 50, 31) # 31x31 kernels
      1 loop, best of 3: 4.39 s per loop
      1 loop, best of 3: 220 ms per loop
      1 loop, best of 3: 560 ms per loop
      

    So, in short, ndimage.convolve was the fastest in every test except the one with a very large kernel (K = 31), where signal.fftconvolve was faster.
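
    These numbers come from a single machine and library version, so it may be worth re-running them. Below is a rough sketch of the same benchmark using the standard timeit module instead of the %timeit magic, so that it runs as a plain script. The function name time_convolutions, the repeat parameter and the dictionary labels are hypothetical, and method="direct" is an assumption made to keep the first entry distinct from fftconvolve.

```python
import timeit

import numpy as np
from scipy import ndimage, signal


def time_convolutions(S, N, K, repeat=3):
    """Return the best time in seconds for each method.

    S: image size, N: number of images, K: kernel size (odd, at least 3).
    """
    rng = np.random.default_rng(0)
    a = rng.standard_normal((S, S, N))
    kernel = rng.standard_normal((K, K))[..., None]
    valid = (slice(K // 2, -(K // 2)), slice(K // 2, -(K // 2)))

    candidates = {
        "signal.convolve (direct)": lambda: signal.convolve(
            a, kernel, "valid", method="direct"
        ),
        "signal.fftconvolve": lambda: signal.fftconvolve(a, kernel, "valid"),
        "ndimage.convolve": lambda: ndimage.convolve(a, kernel)[valid],
    }
    # Run each candidate `repeat` times and keep the fastest run.
    return {
        name: min(timeit.repeat(fn, number=1, repeat=repeat))
        for name, fn in candidates.items()
    }


timings = time_convolutions(S=32, N=4, K=5)
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.2f} ms")
```

    Calling time_convolutions(100, 50, 11) and the other argument combinations from the tests above would reproduce the original configurations; the small sizes used here only keep the example fast.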
