I'm using scipy.optimize.leastsq in conjunction with a simulator. leastsq calls a user-defined objective function and passes an input vector to it.

The algorithm used by leastsq, Levenberg-Marquardt, needs to know the value of the objective function at the current point before determining the next point. In short, there is no straightforward way to parallelize such a serial algorithm.
You can, however, parallelize your objective function in some cases. This can be done if it is of the form:
import numpy as np

def objective_f(params):
    r = np.zeros([200], float)
    for j in range(200):
        r[j] = run_simulation(j, params)
    return r

def run_simulation(j, params):
    r1 = ...  # compute the j-th entry of the result
    return r1
Here, you can clearly parallelize across the loop over j, for instance using the multiprocessing module. Something like this (untested):
import multiprocessing

def objective_f(params):
    r = np.zeros([200], float)

    def parameters():
        for j in range(200):
            yield j, params

    pool = multiprocessing.Pool()
    # starmap unpacks each (j, params) tuple into run_simulation's arguments;
    # run_simulation must be defined at module level so it can be pickled
    r[:] = pool.starmap(run_simulation, parameters())
    return r
Another opportunity for parallelization occurs if you have to fit multiple data sets --- this is an (embarrassingly) parallel problem, and the different data sets can be fitted in parallel.
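A minimal sketch of this, assuming a hypothetical model(params, x) function and a list of (x, y, p0) data set tuples named datasets (both placeholders you would supply yourself):

import multiprocessing
import scipy.optimize

def fit_one(dataset):
    # fit a single data set independently; model() is a placeholder
    x, y, p0 = dataset
    def residuals(params):
        return y - model(params, x)
    p, ier = scipy.optimize.leastsq(residuals, p0)
    return p

if __name__ == "__main__":
    pool = multiprocessing.Pool()
    # each data set is fitted in its own worker process
    fitted = pool.map(fit_one, datasets)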
If this does not help, you can look into the discussion of parallelizing the LM algorithm in the literature. For instance: http://dl.acm.org/citation.cfm?id=1542338 The main optimization suggested in this paper seems to be parallelizing the numerical computation of the Jacobian. You can do this by supplying your own parallelized Jacobian function to leastsq. The paper's remaining suggestion, speculatively parallelizing the Levenberg-Marquardt search steps, is more difficult to implement and requires changes to the LM algorithm itself.
I'm not aware of Python (or other language) libraries implementing optimization algorithms targeted for parallel computation, although there may be some. If you manage to implement or find one, please advertise it on the SciPy users mailing list --- there is certainly interest in such a library!
NumPy/SciPy's functions are usually optimized for multithreading. Did you look at your CPU utilization to confirm that only one core is being used while the simulation is being run? Otherwise you have nothing to gain from running multiple instances.

If it is, in fact, single-threaded, then your best option is to employ the multiprocessing module. It runs several instances of the Python interpreter so you can make several simultaneous calls to SciPy.
Have you tried scipy.optimize.least_squares? It is a much better option, and when I use it to optimize a function it uses all the available threads, which is exactly what you asked for.
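A minimal usage sketch, with a made-up exponential model and synthetic data (substitute your simulator's residuals):

import numpy as np
from scipy.optimize import least_squares

def residuals(params, x, y):
    # made-up model: params[0] * exp(params[1] * x)
    return params[0] * np.exp(params[1] * x) - y

x = np.linspace(0, 1, 200)
y = 2.0 * np.exp(-1.5 * x)

result = least_squares(residuals, x0=[1.0, -1.0], args=(x, y))
print(result.x)  # fitted parameters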
There's a good opportunity to speed up leastsq by supplying your own function to calculate the derivatives (the Dfun parameter), provided you have several parameters. If this function is not supplied, leastsq iterates over each of the parameters to calculate the derivative each time, which is time consuming. This appears to take the majority of the time in the fitting.
You can use your own Dfun function which calculates the derivatives for each parameter using a multiprocessing.Pool to do the work. These derivatives can be calculated independently and should be trivially parallelised.
Here is a rough example showing how to do this:

import numpy as np
import multiprocessing
import scipy.optimize

def calcmod(params):
    """Return the model (func is a placeholder for your model function)."""
    return func(params)

def delta(params):
    """Difference between model and data (y is a placeholder for your data)."""
    return calcmod(params) - y

pool = multiprocessing.Pool(4)

def Dfun(params):
    """Calculate derivatives for each parameter using the pool."""
    zeropred = calcmod(params)
    derivparams = []
    eps = 1e-4  # finite-difference step size
    for i in range(len(params)):
        copy = np.array(params)
        copy[i] += eps
        derivparams.append(copy)
    # evaluate the model at each perturbed parameter set in parallel
    results = pool.map(calcmod, derivparams)
    derivs = [(r - zeropred) / eps for r in results]
    # with col_deriv=1 below, each row holds the derivatives with
    # respect to one parameter
    return derivs

# inputparams is a placeholder for your initial parameter guess
retn = scipy.optimize.leastsq(delta, inputparams, gtol=0.01,
                              Dfun=Dfun, col_deriv=1)
Does this help? http://docs.python.org/library/multiprocessing.html I've always found Pool to be the simplest way to multiprocess with Python.
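As a minimal sketch (simulate here is just a stand-in for an expensive computation):

from multiprocessing import Pool

def simulate(params):
    # stand-in for an expensive simulation call
    return sum(p * p for p in params)

if __name__ == "__main__":
    with Pool() as pool:
        results = pool.map(simulate, [[1, 2], [3, 4], [5, 6]])
    print(results)  # [5, 25, 61]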