Fitting polynomials to data

生来不讨喜 2020-11-28 01:39

Is there a way, given a set of values (x,f(x)), to find the polynomial of a given degree that best fits the data?

I know polynomial interpolation, which finds the degree-n polynomial passing exactly through n+1 given points, but that's not quite what I'm after here.

10 Answers
  • 2020-11-28 02:10

    Bear in mind that a polynomial of higher degree always fits the data at least as well (the least-squares error never increases with degree). Higher-degree polynomials typically lead to highly improbable functions (see Occam's razor), though, i.e. overfitting. You want to find a balance between simplicity (the degree of the polynomial) and fit (e.g. the least-squares error). Quantitatively, there are tests for this: the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). These give a score indicating which model is to be preferred.
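
    For illustration, a minimal sketch of that idea in Python (not from this answer; it assumes NumPy, uses a Gaussian-error AIC, and aic_for_degree is a made-up helper name):

    import numpy as np

    def aic_for_degree(x, y, degree):
        """AIC (up to an additive constant) for a least-squares polynomial fit."""
        coeffs = np.polyfit(x, y, degree)
        rss = np.sum((y - np.polyval(coeffs, x)) ** 2)   # residual sum of squares
        k = degree + 1                                   # number of fitted parameters
        return len(x) * np.log(rss / len(x)) + 2 * k

    x = np.linspace(0.0, 10.0, 50)
    y = 3.0 * x**2 - x + np.random.normal(scale=5.0, size=x.size)
    best = min(range(1, 8), key=lambda d: aic_for_degree(x, y, d))
    print("degree preferred by AIC:", best)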

  • 2020-11-28 02:10

    At college we had this book, which I still find extremely useful: Conte, de Boor; Elementary Numerical Analysis; McGraw-Hill. The relevant section is 6.2: Data Fitting.
    The example code is in FORTRAN, and the listings are not very readable either, but the explanations are deep and clear at the same time. You end up understanding what you are doing, not just doing it (which is my experience with Numerical Recipes).
    I usually start with Numerical Recipes, but for things like this I quickly have to grab Conte-de Boor.

    Maybe it's better to post some code... it's a bit stripped down, but the most relevant parts are there. It relies on numpy, obviously!

    import numpy
    from functools import lru_cache


    def Tn(n, x):
      """Chebyshev polynomial of the first kind, T_n(x), via the three-term recurrence."""
      if n == 0:
        return 1.0
      elif n == 1:
        return float(x)
      else:
        return (2.0 * x * Tn(n - 1, x)) - Tn(n - 2, x)


    class ChebyshevFit:

      def __init__(self):
        # cache repeated Tn(n, x) evaluations (stands in for the original Memoize helper)
        self.Tn = lru_cache(maxsize=None)(Tn)

      def fit(self, data, degree=None):
        """Fit the data by a least-squares linear combination of Chebyshev polynomials.

        cf. Conte, de Boor; Elementary Numerical Analysis; McGraw-Hill (6.2: Data Fitting)
        """

        if degree is None:
          degree = 5

        data = sorted(data)
        self.range = start, end = (min(data)[0], max(data)[0])
        self.halfwidth = (end - start) / 2.0
        # map the x values onto [-1, 1], where the Chebyshev polynomials are defined
        vec_x = [(x - start - self.halfwidth) / self.halfwidth for (x, y) in data]
        vec_f = [y for (x, y) in data]

        # normal equations A c = b with A[i][j] = <T_i, T_j> and b[i] = <f, T_i>
        mat_phi = [numpy.array([self.Tn(i, x) for x in vec_x]) for i in range(degree + 1)]
        mat_A = numpy.inner(mat_phi, mat_phi)
        vec_b = numpy.inner(vec_f, mat_phi)

        self.coefficients = numpy.linalg.solve(mat_A, vec_b)
        self.degree = degree

      def evaluate(self, x):
        """Evaluate the fitted expansion with the Clenshaw algorithm.

        http://en.wikipedia.org/wiki/Clenshaw_algorithm
        """

        # apply the same [-1, 1] mapping used in fit()
        x = (x - self.range[0] - self.halfwidth) / self.halfwidth

        b_2 = float(self.coefficients[self.degree])
        b_1 = 2 * x * b_2 + float(self.coefficients[self.degree - 1])

        for i in range(2, self.degree):
          b_1, b_2 = 2.0 * x * b_1 + self.coefficients[self.degree - i] - b_2, b_1

        b_0 = x * b_1 + self.coefficients[0] - b_2

        return b_0
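
    For example, usage might look like this (a hypothetical sketch for the class above):

    import math

    fitter = ChebyshevFit()
    data = [(x, math.sin(x)) for x in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0)]
    fitter.fit(data, degree=4)
    print(fitter.evaluate(1.25))   # approximation of sin(1.25)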
    
  • 2020-11-28 02:12

    The Lagrange polynomial is in some sense the "simplest" interpolating polynomial that fits a given set of data points.

    It is sometimes problematic because it can vary wildly between data points.
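
    For instance, if SciPy happens to be available, here is a small sketch of both points (exact fit at the nodes, possible wild swings elsewhere); the data values are made up:

    import numpy as np
    from scipy.interpolate import lagrange

    x = np.array([0.0, 1.0, 2.0, 3.0])
    y = np.array([1.0, 2.0, 0.0, 5.0])

    poly = lagrange(x, y)   # numpy.poly1d of degree len(x) - 1
    print(poly(x))          # reproduces y exactly at the data points
    print(poly(1.5))        # but may oscillate between/outside them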

  • 2020-11-28 02:15

    Yes, the way this is typically done is by using least squares. There are other ways of specifying how well a polynomial fits, but the theory is simplest for least squares. The general theory is called linear regression.

    Your best bet is probably to start with Numerical Recipes.

    R is free and will do everything you want and more, but it has a big learning curve.

    If you have access to Mathematica, you can use the Fit function to do a least squares fit. I imagine MATLAB and its open source counterpart Octave have a similar function.
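
    In Python, for instance, a least-squares polynomial fit is a single call (a sketch assuming NumPy; the data values are made up):

    import numpy as np

    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([0.1, 0.9, 4.2, 8.8, 16.1, 24.9])

    coeffs = np.polyfit(x, y, deg=2)   # best-fit quadratic in the least-squares sense
    print(coeffs)                      # highest-order coefficient first
    print(np.polyval(coeffs, 2.5))     # evaluate the fitted polynomial at a new x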

  • 2020-11-28 02:19

    Lagrange polynomials (as @j w posted) give you an exact fit at the points you specify, but with polynomials of degree more than say 5 or 6 you can run into numerical instability.

    Least squares gives you the "best fit" polynomial, with error defined as the sum of squares of the individual errors (take the distance along the y-axis between the points you have and the fitted function, square each one, and sum them up). The MATLAB polyfit function does this, and with multiple return arguments you can have it automatically take care of scaling/offset issues (e.g. if you have 100 points all between x=312.1 and 312.3 and you want a 6th-degree polynomial, you're going to want to calculate u = (x-312.2)/0.1 so the u-values are distributed between -1 and +1); there's a sketch of this scaling trick at the end of this answer.

    NOTE that the results of least-squares fits are strongly influenced by the distribution of x-axis values. If the x-values are equally spaced, then you'll get larger errors at the ends. If you have a case where you can choose the x values and you care about the maximum deviation of the fitted polynomial from your known function, then the use of Chebyshev polynomials will give you something that is close to the perfect minimax polynomial (which is very hard to calculate). This is discussed at some length in Numerical Recipes.

    Edit: From what I gather, this all works well for functions of one variable. For multivariate functions it is likely to be much more difficult if the degree is more than, say, 2. I did find a reference on Google Books.
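
    Here is the scaling trick mentioned above as a Python sketch (made-up data; MATLAB's [p, S, mu] = polyfit(...) form does the equivalent automatically):

    import numpy as np

    x = np.linspace(312.1, 312.3, 100)
    y = np.sin(40.0 * (x - 312.2))            # some function on a narrow x-range

    u = (x - x.mean()) / x.std()              # center and scale before fitting
    coeffs = np.polyfit(u, y, deg=6)

    x_new = 312.25
    u_new = (x_new - x.mean()) / x.std()      # apply the same transform before evaluating
    print(np.polyval(coeffs, u_new))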

  • 2020-11-28 02:19

    It's rather easy to scare up a quick fit using Excel's matrix functions if you know how to represent the least squares problem as a linear algebra problem. (That depends on how reliable you think Excel is as a linear algebra solver.)
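
    The same formulation in Python terms, as a sketch of the linear algebra you would set up with Excel's matrix functions (the data values are made up):

    import numpy as np

    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = np.array([1.1, 0.9, 3.2, 9.1, 16.8])
    degree = 2

    A = np.vander(x, degree + 1)   # Vandermonde matrix: columns x^2, x^1, x^0
    coeffs, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
    print(coeffs)                  # same ordering convention as numpy.polyfit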
