Performance: Matlab vs Python

长情又很酷 2020-11-29 13:06

I recently switched from Matlab to Python. While converting one of my lengthy codes, I was surprised to find Python very slow. I

5 Answers
  •  星月不相逢
    2020-11-29 13:27

    Comparing JIT compilers

    It has been mentioned that Matlab uses an internal JIT compiler to get good performance on such tasks. Let's compare Matlab's JIT compiler with a Python JIT compiler (Numba).

    Code

    import numba as nb
    import numpy as np
    import math
    import time
    
    #If the arrays are somewhat larger it also makes sense to parallelize this problem
    #(a parallel sketch follows below the timing code); cache=True may also make sense
    @nb.njit(fastmath=True)
    def exampleKernelA(M, x, N, y):
        """Example kernel function A"""
        #explicitly declaring the size of the second dim also improves performance a bit
        assert x.shape[1] == 2
        assert y.shape[1] == 2

        #Works with all dtypes; zeroing isn't necessary
        kernel = np.empty((M, N), dtype=x.dtype)
        for i in range(M):
            for j in range(N):
                # Define the custom kernel function here
                kernel[i, j] = np.sqrt((x[i, 0] - y[j, 0]) ** 2 + (x[i, 1] - y[j, 1]) ** 2)
        return kernel
    
    
    def exampleKernelB(M, x, N, y):
        """Example kernel function B"""
        # Euclidean norm function implemented using the meshgrid idea.
        # Fastest pure-NumPy variant (no Numba)
        x0, y0 = np.meshgrid(y[:, 0], x[:, 0])
        x1, y1 = np.meshgrid(y[:, 1], x[:, 1])
        # Define the custom kernel function here
        kernel = np.sqrt((x0 - y0) ** 2 + (x1 - y1) ** 2)
        return kernel
    
    #Same as exampleKernelA, but without fastmath
    @nb.njit()
    def exampleKernelC(M, x, N, y):
        """Example kernel function C"""
        #explicitly declaring the size of the second dim also improves performance a bit
        assert x.shape[1] == 2
        assert y.shape[1] == 2

        #Works with all dtypes; zeroing isn't necessary
        kernel = np.empty((M, N), dtype=x.dtype)
        for i in range(M):
            for j in range(N):
                # Define the custom kernel function here
                kernel[i, j] = np.sqrt((x[i, 0] - y[j, 0]) ** 2 + (x[i, 1] - y[j, 1]) ** 2)
        return kernel
    
    
    #Your test data
    xVec = np.array([
        [49.7030,  78.9590],
        [42.6730,  11.1390],
        [23.2790,  89.6720],
        [75.6050,  25.5890],
        [81.5820,  53.2920],
        [44.9680,   2.7770],
        [38.7890,  78.9050],
        [39.1570,  33.6790],
        [33.2640,  54.7200],
        [4.8060 ,  44.3660],
        [49.7030,  78.9590],
        [42.6730,  11.1390],
        [23.2790,  89.6720],
        [75.6050,  25.5890],
        [81.5820,  53.2920],
        [44.9680,   2.7770],
        [38.7890,  78.9050],
        [39.1570,  33.6790],
        [33.2640,  54.7200],
        [4.8060 ,  44.3660]
        ])
    
    #compilation happens on the first call
    #it can be avoided with cache=True
    res=exampleKernelA(xVec.shape[0], xVec, xVec.shape[0], xVec)
    res=exampleKernelC(xVec.shape[0], xVec, xVec.shape[0], xVec)
    
    #time 10,000 calls of each variant
    t1 = time.time()
    for i in range(10_000):
        res = exampleKernelA(xVec.shape[0], xVec, xVec.shape[0], xVec)

    print(time.time() - t1)

    t1 = time.time()
    for i in range(10_000):
        res = exampleKernelC(xVec.shape[0], xVec, xVec.shape[0], xVec)

    print(time.time() - t1)

    t1 = time.time()
    for i in range(10_000):
        res = exampleKernelB(xVec.shape[0], xVec, xVec.shape[0], xVec)

    print(time.time() - t1)
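
    The in-code comments on exampleKernelA mention parallelization and caching. A minimal sketch of such a variant is shown below (the name exampleKernelA_par and the exact flag combination are my own choices, not part of the original code); it mainly pays off for larger arrays, where the thread start-up overhead is amortized:

    #Hypothetical parallel + cached variant of exampleKernelA
    #(uses the numba (nb) and numpy (np) imports from above)
    @nb.njit(fastmath=True, parallel=True, cache=True)
    def exampleKernelA_par(M, x, N, y):
        """Parallel variant of example kernel function A"""
        assert x.shape[1] == 2
        assert y.shape[1] == 2

        kernel = np.empty((M, N), dtype=x.dtype)
        #nb.prange distributes the outer loop across threads
        for i in nb.prange(M):
            for j in range(N):
                kernel[i, j] = np.sqrt((x[i, 0] - y[j, 0]) ** 2 + (x[i, 1] - y[j, 1]) ** 2)
        return kernel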
    

    Performance

    exampleKernelA: 0.03s
    exampleKernelC: 0.03s
    exampleKernelB: 1.02s
    Matlab_2016b (your code, but 10,000 repetitions, after a few warm-up runs): 0.165s
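
    For somewhat more stable numbers than a hand-written time.time() loop, the same measurement could also be done with timeit (a sketch, reusing the functions and xVec defined above):

    import timeit

    #best of 5 repeats, 10,000 calls each, analogous to the loops above
    best = min(timeit.repeat(
        lambda: exampleKernelA(xVec.shape[0], xVec, xVec.shape[0], xVec),
        repeat=5, number=10_000))
    print(best)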
    
