Minimizing functions with large gradients using `scipy.optimize.minimize`


For large non-linear optimization problems, one would typically pay attention to (at least) four things:

  1. Scaling
  2. Initial values
  3. Bounds
  4. Precise gradients and, if possible, second derivatives (for complex problems, use a modeling system that allows automatic differentiation); see the sketch after this list
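
As a concrete illustration of point 4, here is a minimal sketch of passing an analytic gradient to `scipy.optimize.minimize` via the `jac` argument. The Rosenbrock function is used purely as a stand-in objective; in a real model you would obtain the gradient from automatic differentiation rather than coding it by hand:

```python
import numpy as np
from scipy.optimize import minimize

def rosen(x):
    # Classic Rosenbrock test function: f(x, y) = (1 - x)^2 + 100 (y - x^2)^2
    return (1.0 - x[0])**2 + 100.0 * (x[1] - x[0]**2)**2

def rosen_grad(x):
    # Hand-coded analytic gradient; for complex problems, generate this
    # with automatic differentiation instead of deriving it manually.
    return np.array([
        -2.0 * (1.0 - x[0]) - 400.0 * x[0] * (x[1] - x[0]**2),
        200.0 * (x[1] - x[0]**2),
    ])

x0 = np.array([-1.2, 1.0])                        # a reasonable initial point
res = minimize(rosen, x0, jac=rosen_grad, method="BFGS")
print(res.x)                                      # should be close to [1., 1.]
```

Supplying `jac` avoids the noisy finite-difference gradients the solver would otherwise estimate, which matters most exactly when the gradients are large or badly scaled.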

Some more advanced solvers provide support for automatic scaling. However, scaling a non-linear problem is not that easy, because the Jacobian changes from point to point (typical strategies are: scale only the linear part, scale the linear and non-linear parts once at the beginning based on initial values, or rescale the problem during the iteration process). Linear solvers have an easier job in this respect: the Jacobian is constant, so the problem can be scaled once at the start.

`scipy.optimize.minimize` is not the most advanced solver, so I would encourage you to scale things yourself. Typically you can only do this once, before the solver starts; in some cases you may even stop the solver to rescale and then call it again using the last point as the initial value. This sounds crazy, but the trick has helped me a few times.

A good initial point and good bounds also help, by keeping the solver in reasonable regions where functions and gradients can be evaluated reliably. Finally, model reformulations can sometimes provide better scaling (replace division by multiplication, take logs, etc.).
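
As a hedged sketch of what "scale things yourself" can look like in practice (the objective and the scale factors below are made up for illustration), you can wrap the objective so the solver works in scaled variables `z = x / s`, where all components are O(1), and map the solution back afterwards:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical objective whose variables live on very different scales,
# e.g. x[0] ~ 1e-3 and x[1] ~ 1e4.
def f(x):
    return (1e3 * x[0] - 1.0)**2 + (1e-4 * x[1] - 2.0)**2

s = np.array([1e-3, 1e4])            # rough magnitudes of the variables

def f_scaled(z):
    # The solver sees z = x / s, so every variable it works with is O(1).
    return f(z * s)

z0 = np.array([1.0, 1.0])            # initial guess in scaled space
res = minimize(f_scaled, z0, method="BFGS")
x_opt = res.x * s                    # map the solution back to original units
print(x_opt)                         # should be close to [1e-3, 2e4]
```

The same wrapper pattern supports the restart trick mentioned above: stop the solver, recompute `s` from the current point, and call `minimize` again with the rescaled last iterate as the initial value.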
