Error in calculating integral for 2D interpolation. Comparing numpy arrays

面向向阳花 2020-12-11 12:47

My optimization task deals with calculation of the following integral and finding the best values of xl and xu:
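
In the notation of the answers below, this is the integral

K(xl, xu) = ∫_{xl}^{xu} x² eˣ / ((xu - xl) (eˣ - 1)²) dx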

Iterations take to

2 Answers
  • 2020-12-11 13:20

    I can't reproduce your error unless I omit the np.vectorize decorator. Setting xl/xu values that coincide does give me a ZeroDivisionError though.

    Anyway, there's nothing stopping you from checking the values of xu vs xl in your higher-level function. That way you can skip integration entirely for nonsensical data points and return np.nan early:

    import numpy as np
    import mpmath
    import scipy.integrate as integrate
    
    def k_integrand(x, xl, xu):    
        return ((x**2)*mpmath.exp(x))/((xu - xl)*(mpmath.exp(x)-1)**2)
    
    @np.vectorize   
    def K(xl, xu):
        if xu <= xl:
            # don't even try to integrate
            return np.nan
        y, err = integrate.quad(k_integrand, xl, xu, args = (xl, xu))
        return y
    
    grid_xl = np.linspace(0.1,1,10)        # shape (10,) ~ (1,10)
    grid_xu = np.linspace(0.5,4,8)[:,None] # shape (8,1)
    

    With these definitions I get the following (8, 10) array (after np.set_printoptions(linewidth=200) for easier comparison):

    In [35]: K(grid_xl, grid_xu)
    Out[35]: 
    array([[0.99145351, 0.98925197, 0.98650808, 0.98322919,        nan,        nan,        nan,        nan,        nan,        nan],
           [0.97006703, 0.96656815, 0.96254363, 0.95800307, 0.95295785, 0.94742104, 0.94140733, 0.93493293, 0.9280154 ,        nan],
           [0.93730403, 0.93263063, 0.92745487, 0.92178832, 0.91564423, 0.90903747, 0.90198439, 0.89450271, 0.88661141, 0.87833062],
           [0.89565597, 0.88996696, 0.88380385, 0.87717991, 0.87010995, 0.8626103 , 0.85469862, 0.84639383, 0.83771595, 0.82868601],
           [0.84794429, 0.8414176 , 0.83444842, 0.82705134, 0.81924245, 0.81103915, 0.8024601 , 0.79352503, 0.7842547 , 0.77467065],
           [0.79692339, 0.78974   , 0.78214742, 0.77416128, 0.76579857, 0.75707746, 0.74801726, 0.73863822, 0.72896144, 0.71900874],
           [0.7449893 , 0.73732055, 0.7292762 , 0.72087263, 0.71212741, 0.70305921, 0.69368768, 0.68403329, 0.67411725, 0.66396132],
           [0.69402415, 0.68602325, 0.67767956, 0.66900991, 0.66003222, 0.65076537, 0.6412291 , 0.63144388, 0.62143077, 0.61121128]])
    

    You can see that the nan entries are exactly the points where xu <= xl, and that the remaining values perfectly agree with your linked image.

    Now, I've got bad news and good news. The bad news is that while np.vectorize provides syntactic sugar around calling your scalar integration function with array inputs, it won't actually give you a speed-up compared to a native for loop (see the sketch at the end of this answer). The good news is that you can replace the calls to mpmath.exp with calls to np.exp and you'll end up with the same result much faster:

    def k_integrand_np(x, xl, xu):    
        return ((x**2)*np.exp(x))/((xu - xl)*(np.exp(x)-1)**2)
    
    @np.vectorize   
    def K_np(xl, xu):
        if xu <= xl:
            # don't even try to integrate
            return np.nan
        y, err = integrate.quad(k_integrand_np, xl, xu, args = (xl, xu))
        return y
    

    With these definitions

    In [14]: res_mpmath = K(grid_xl, grid_xu)
        ...: res_np = K_np(grid_xl, grid_xu)
        ...: inds = ~np.isnan(res_mpmath)
        ...: 
    
    In [15]: np.array_equal(res_mpmath[inds], res_np[inds])
    Out[15]: True
    
    In [16]: %timeit K(grid_xl, grid_xu)
    107 ms ± 521 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
    
    In [17]: %timeit K_np(grid_xl, grid_xu)
    7.26 ms ± 157 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    

    So the two methods give the same result (exactly!), but the numpy version is almost 15 times faster.
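
    For reference, here is roughly the explicit loop that np.vectorize saves you from writing; a minimal sketch (the helper name loop_K is made up, and it reuses k_integrand_np from above):

    import numpy as np
    import scipy.integrate as integrate
    
    def loop_K(xl_vals, xu_vals):
        # hypothetical plain-loop equivalent of the vectorized K_np above;
        # broadcast the two grids against each other, as np.vectorize does
        xl_b, xu_b = np.broadcast_arrays(xl_vals, xu_vals)
        out = np.empty(xl_b.shape)
        # one scalar quad call per grid point
        for idx in np.ndindex(xl_b.shape):
            xl, xu = xl_b[idx], xu_b[idx]
            if xu <= xl:
                out[idx] = np.nan   # skip nonsensical points, as in K_np
            else:
                out[idx], _ = integrate.quad(k_integrand_np, xl, xu, args=(xl, xu))
        return out
    

    Timing loop_K against K_np should give you essentially the same runtime, which is the point: the speed-up above comes from swapping mpmath.exp for np.exp, not from np.vectorize itself.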

  • 2020-12-11 13:20

    Using a low-level callback function for integration

    The following answer is meant as a comment to @Andras Deak's answer, but it is too long for a comment.

    The scipy integration routines call the k_integrand_np function many times, and each call goes through the Python interpreter, which is quite slow. The alternative to a pure Python callback is a low-level callback function, which can be written directly in C or compiled from Python with a compiler like Numba. The following is a slightly modified version of this answer.

    Example

    import time
    import numpy as np
    import numba
    import scipy.integrate as integrate
    from scipy import LowLevelCallable
    from numba import cfunc
    from numba.types import intc, CPointer, float64
    
    
    # wrapper for an integrand that takes 3 input values (x plus two extra args)
    def jit_integrand_function(integrand_function):
        jitted_function = numba.njit(integrand_function)

        # scipy expects the C signature double f(int n, double *xx), where
        # xx[0] is the integration variable and xx[1:] are the extra arguments
        @cfunc(float64(intc, CPointer(float64)))
        def wrapped(n, xx):
            return jitted_function(xx[0], xx[1], xx[2])
        return LowLevelCallable(wrapped.ctypes)

    # your function to integrate
    def k_integrand_np(x, xl, xu):
        return ((x**2)*np.exp(x))/((xu - xl)*(np.exp(x)-1)**2)

    # compile the integrand into a low-level callable
    k_integrand_nb = jit_integrand_function(k_integrand_np)
    
    #now we can use the low-level callable
    @np.vectorize
    def K_nb(xl, xu):
        if xu <= xl:
            # don't even try to integrate
            return np.nan
        y, err = integrate.quad(k_integrand_nb, xl, xu, args = (xl, xu))
        return y
    
    #for comparison
    @np.vectorize
    def K_np(xl, xu):
        if xu <= xl:
            # don't even try to integrate
            return np.nan
        y, err = integrate.quad(k_integrand_np, xl, xu, args = (xl, xu))
        return y
    

    Performance

    #create some data
    grid_xl = np.linspace(0.1,1,500)      
    grid_xu = np.linspace(0.5,4,800)[:,None] 
    
    t1=time.time()
    res_nb = K_nb(grid_xl, grid_xu)
    print(time.time()-t1)
    t1=time.time()
    res_np = K_np(grid_xl, grid_xu)
    print(time.time()-t1)
    
    inds = ~np.isnan(res_nb)
    np.allclose(res_nb[inds], res_np[inds])
    
    K_np: 24.58s
    K_nb: 0.97s (25x speedup)
    allclose: True
    