Multithreaded calls to the objective function of scipy.optimize.leastsq

孤街浪徒 2020-12-29 00:47

I'm using scipy.optimize.leastsq in conjunction with a simulator. leastsq calls a user-defined objective function and passes an input vector to it.

5 Answers
  • 2020-12-29 00:54

    The algorithm used by leastsq, Levenberg-Marquardt, needs to know the value of the objective function at the current point before determining the next point. In short, there is no straightforward way to parallelize such a serial algorithm.

    You can, however, parallelize the objective function itself in some cases. This can be done if it has the form:

    def objective_f(params):
        r = np.zeros([200], float)
        for j in range(200):
            r[j] = run_simulation(j, params)
        return r
    
    def run_simulation(j, params):
        r1 = ... compute j-th entry of the result ...
        return r1
    

    Here, you can clearly parallelize the loop over j, for instance using the multiprocessing module. Something like this (untested):

    def objective_f(params):
        r = np.zeros([200], float)
        def parameters():
            for j in range(200):
                yield j, params
        pool = multiprocessing.Pool()
        # starmap unpacks each (j, params) tuple into run_simulation's arguments
        r[:] = pool.starmap(run_simulation, parameters())
        return r
    

    Another opportunity for parallelization occurs if you have to fit multiple data sets --- this is an (embarrassingly) parallel problem, and the different data sets can be fitted in parallel.

    If this does not help, you can look into the literature on parallelizing the LM algorithm itself. For instance: http://dl.acm.org/citation.cfm?id=1542338 The main optimization suggested in this paper is parallelizing the numerical computation of the Jacobian; you can do this by supplying your own parallelized Jacobian function to leastsq. The paper's remaining suggestion, speculatively parallelizing the Levenberg-Marquardt search steps, is more difficult to implement and requires changes to the LM algorithm.

    I'm not aware of Python (or other language) libraries implementing optimization algorithms targeted at parallel computation, although there may be some. If you manage to implement or find one, please advertise it on the Scipy users mailing list --- there is certainly interest in such a thing!

  • 2020-12-29 01:01

    NumPy/SciPy functions are usually optimized for multithreading. Did you look at your CPU utilization to confirm that only one core is being used while the simulation is being run? If several cores are already busy, you have nothing to gain from running multiple instances.

    If it is, in fact, single threaded, then your best option is the multiprocessing module. It runs several instances of the Python interpreter, so you can make several simultaneous calls to SciPy.

  • 2020-12-29 01:08

    Have you tried scipy.optimize.least_squares? It is a much better option, and when I use it to optimize a function it keeps all the available cores busy --- which is exactly what you asked for.
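    For reference, a minimal least_squares call, with a hypothetical exponential-decay model and noise-free synthetic data (any multi-core usage here comes from the multithreaded linear algebra underneath, not from least_squares spawning threads itself):

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(params, t, y):
    # Hypothetical model: a * exp(-k * t)
    a, k = params
    return a * np.exp(-k * t) - y

t = np.linspace(0.0, 2.0, 100)
y = 3.0 * np.exp(-1.5 * t)  # synthetic data with a=3.0, k=1.5
result = least_squares(residuals, x0=[1.0, 1.0], args=(t, y))
print(result.x)  # close to [3.0, 1.5]
```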

  • 2020-12-29 01:10

    There's a good opportunity to speed up leastsq by supplying your own function to calculate the derivatives (the Dfun parameter), provided you have several parameters. If this function is not supplied, leastsq perturbs each parameter in turn to estimate the derivative numerically, which is time consuming. This appears to take the majority of the time in the fitting.

    You can use your own Dfun function which calculates the derivatives for each parameter using a multiprocessing.Pool to do the work. These derivatives can be calculated independently and should be trivially parallelised.

    Here is a rough example, showing how to do this:

    import numpy as np
    import multiprocessing
    import scipy.optimize
    
    def calcmod(params):
        """Return the model (func is a placeholder for your model function)."""
        return func(params)
    
    def delta(params):
        """Residuals: difference between model and data y."""
        return calcmod(params) - y
    
    pool = multiprocessing.Pool(4)
    
    def Dfun(params):
        """Calculate the derivative for each parameter using the pool."""
        zeropred = calcmod(params)
    
        derivparams = []
        eps = 1e-4
        for i in range(len(params)):
            copy = np.array(params)
            copy[i] += eps
            derivparams.append(copy)
    
        # evaluate the model at each shifted parameter vector in parallel
        results = pool.map(calcmod, derivparams)
        derivs = [(r - zeropred) / eps for r in results]
        return derivs
    
    retn = scipy.optimize.leastsq(delta, inputparams, gtol=0.01,
                                  Dfun=Dfun, col_deriv=1)
    
  • 2020-12-29 01:10

    Does this help? http://docs.python.org/library/multiprocessing.html

    I've always found Pool to be the simplest way to multiprocess with Python.
