I am using scipy.optimize.fmin_l_bfgs_b to solve a gaussian mixture problem. The means of mixture distributions are modeled by regressions whose weights have to be optimized
As pointed out in the answer by Wilmer E. Henao, the problem is probably in the gradient. Since you are using approx_grad=True, the gradient is calculated numerically. In this case, reducing the value of epsilon, which is the step size used for numerically calculating the gradient, can help.