Why does scipy.optimize.curve_fit not produce a line of best fit for my points?

Question

I have a set of data points (x and y in the code below) and I am trying to fit a straight line of best fit through them using scipy.optimize.curve_fit. My code produces a line, but not a line of best fit. I have tried supplying parameter values for the gradient and the intercept, but each time it produces exactly the same line, which does not fit my data points.

[Plot: the blue dots are my data points, which the red line should be fitted to.]

If anyone could point out where I am going wrong I would be extremely grateful:

import numpy as np
import matplotlib.pyplot as mpl
import scipy as sp
import scipy.optimize as opt

x=[1.0,2.5,3.5,4.0,1.1,1.8,2.2,3.7]
y=[6.008,15.722,27.130,33.772,5.257,9.549,11.098,28.828]
trialX = np.linspace(1.0,4.0,1000)                         #Trial values of x

def f(x,m,c):                                        #Defining the function y(x)=(m*x)+c
    return (x*m)+c

popt,pcov=opt.curve_fit(f,x,y)                       #Returning popt and pcov
ynew=f(trialX,*popt)                                                  

mpl.plot(x,y,'bo')
mpl.plot(trialX,ynew,'r-')
mpl.show()

Answer 1:

You could alternatively use numpy.polyfit to get the line of best fit:

import numpy as np
import matplotlib.pyplot as mpl

x=[1.0,2.5,3.5,4.0,1.1,1.8,2.2,3.7]
y=[6.008,15.722,27.130,33.772,5.257,9.549,11.098,28.828]
trialX = np.linspace(1.0,4.0,1000)                         #Trial values of x

# Fit a first-degree polynomial; fit[0] is the slope and fit[1] the intercept
fit = np.polyfit(x, y, 1)

# Evaluate the fitted line at the trial x values
ynew = trialX * fit[0] + fit[1]

mpl.plot(x,y,'bo')
mpl.plot(trialX,ynew,'r-')
mpl.show()

[Plot: output of the code above, with the data points in blue and the fitted line in red.]
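Since np.polyfit returns the coefficients from the highest degree down, the same fit can also be evaluated with np.polyval and checked numerically without looking at a plot. The following is only a minimal sketch; the names slope, intercept, trial_x, y_fit and residuals are illustrative and do not appear in the original answer.

```python
import numpy as np

x = np.array([1.0, 2.5, 3.5, 4.0, 1.1, 1.8, 2.2, 3.7])
y = np.array([6.008, 15.722, 27.130, 33.772, 5.257, 9.549, 11.098, 28.828])

# np.polyfit returns coefficients from highest degree to lowest,
# so for degree 1 the result unpacks as (slope, intercept).
slope, intercept = np.polyfit(x, y, 1)

# np.polyval evaluates a polynomial given in that same coefficient order.
trial_x = np.linspace(1.0, 4.0, 1000)
y_fit = np.polyval([slope, intercept], trial_x)

# Residuals at the original data points give a quick numerical check of the fit.
residuals = y - np.polyval([slope, intercept], x)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print(f"max |residual| = {np.abs(residuals).max():.3f}")
```

For a least-squares line with an intercept term, the residuals should sum to roughly zero, which makes for an easy sanity check.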

Answer 2:

EDIT: This behavior has now been patched in the current version of scipy to make .curve_fit a bit more foolproof:

https://github.com/scipy/scipy/issues/3037

For some reason, curve_fit really wants its input to be a numpy array and will give you erroneous results if you pass it a regular Python list (IMHO this is unexpected behavior and may be a bug). Change the definition of x to:

x=np.array([1.0,2.5,3.5,4.0,1.1,1.8,2.2,3.7])

And you get a proper line of best fit.

I'm guessing that this happens because m*x, where m is an integer and x is a list, produces m copies of that list, which is clearly not the result you were looking for!
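The difference between list and array multiplication is easy to see in isolation. The sketch below first contrasts the two, then converts the data with np.asarray before calling curve_fit, which should be safe regardless of scipy version. It does not reproduce what older scipy versions did internally, and the names x_list, y_list, repeated and scaled are illustrative only.

```python
import numpy as np
from scipy.optimize import curve_fit

# Multiplying a list by an int repeats the list; it does not scale it.
repeated = [1.0, 2.5] * 2            # [1.0, 2.5, 1.0, 2.5]
# Multiplying an array by a number scales every element.
scaled = np.array([1.0, 2.5]) * 2    # array([2., 5.])

x_list = [1.0, 2.5, 3.5, 4.0, 1.1, 1.8, 2.2, 3.7]
y_list = [6.008, 15.722, 27.130, 33.772, 5.257, 9.549, 11.098, 28.828]


def f(x, m, c):
    """Straight line y = m*x + c."""
    return m * x + c


# Convert once, up front, so the model function always sees arrays.
x = np.asarray(x_list, dtype=float)
y = np.asarray(y_list, dtype=float)

popt, pcov = curve_fit(f, x, y)
m, c = popt
print(f"m={m:.3f}, c={c:.3f}")
```

Because the model is linear in m and c, the result should agree closely with np.polyfit(x, y, 1), which gives a convenient cross-check.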

Source: https://stackoverflow.com/questions/19713689/why-does-scipy-optimize-curve-fit-not-produce-a-line-of-best-fit-for-my-points

Tags

python

optimization

scipy

curve-fitting