Question
I'm using scipy.optimize.fmin_bfgs(f, init_theta, fprime) to minimize f, which has gradient fprime. I compute f and fprime in a single function because most of the computation is shared, so there is no need to do it twice. Is there any way to call fmin_bfgs() with a single function that returns both f and fprime?
Answer 1:
If you're trying to save on computation time rather than just combine the calculation of f and f' for code convenience, it seems like you need an extra wrapper around your function to cache values, since fmin_bfgs doesn't seem to allow you to pass such a function (unlike some other optimization functions).
Here's one way to do that, maintaining the 10 most recently evaluated points in a little cache. (I'm not sure whether calls to this function need to be thread-safe: probably not, but if so, you'll probably need to add some locking in here, I guess.)
import collections
import functools

def func_wrapper(f, cache_size=10):
    evals = {}
    last_points = collections.deque()

    def get(pt, which):
        s = pt.tobytes()  # raw bytes of the numpy array, to make it hashable
        if s not in evals:
            evals[s] = f(pt)
            last_points.append(s)
            if len(last_points) > cache_size:
                del evals[last_points.popleft()]
        return evals[s][which]

    return functools.partial(get, which=0), functools.partial(get, which=1)
If we then do
>>> def f(x):
... print("evaluating", x)
... return (x-3)**2, 2*(x-3)
>>> f_, fprime = func_wrapper(f)
>>> optimize.fmin_bfgs(f_, 1000, fprime)
evaluating [ 994.93480441]
evaluating [ 974.67402207]
evaluating [ 893.63089268]
evaluating [ 665.93446894]
evaluating [ 126.99931561]
evaluating [ 3.]
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 4
Function evaluations: 7
Gradient evaluations: 7
array([ 3.])
we can see that we don't repeat any evaluations.
Answer 2:
Suppose you have a Python function f(x) that returns both the function value and the gradient:
In [20]: def f(x):
....: return (x-3)**2, 2*(x-3)
Then just pass the outputs separately like so:
In [21]: optimize.fmin_bfgs(lambda x: f(x)[0], 1000, lambda x: f(x)[1])
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 4
Function evaluations: 7
Gradient evaluations: 7
Out[21]: array([ 3.])
Answer 3:
There doesn't seem to be a way to do this with fmin_bfgs directly. But scipy.optimize.minimize does let you do it: you can pass jac=True instead of a gradient function. This signals that f returns a tuple of the function value and the gradient. Invoke minimize with method='BFGS' to get the effect you want.

It's enlightening to look at the source code for minimize. Both it and fmin_bfgs eventually call _minimize_bfgs, which takes f and fprime as separate function arguments. When jac is True, minimize cleverly wraps f in an object that remembers the last value returned by f and caches the gradient information.
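As a concrete illustration of this approach, here is a minimal sketch of the minimize call with jac=True, using the same toy objective as the other answers:

```python
import numpy as np
from scipy import optimize

def f(x):
    # A single pass computes both the objective value and its gradient.
    value = (x[0] - 3.0) ** 2
    grad = np.array([2.0 * (x[0] - 3.0)])
    return value, grad

# jac=True tells minimize that f returns a (value, gradient) tuple,
# so no separate fprime function is needed.
res = optimize.minimize(f, x0=[1000.0], method="BFGS", jac=True)
print(res.x)  # converges to approximately [3.]
```

Because f is called once per point and returns both quantities, the shared computation is never duplicated, which is exactly what the question asks for.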
Source: https://stackoverflow.com/questions/10712789/scipy-optimize-fmin-bfgs-single-function-computes-both-f-and-fprime