Fitting polynomials to data

生来不讨喜 2020-11-28 01:39

Is there a way, given a set of values (x,f(x)), to find the polynomial of a given degree that best fits the data?

I know polynomial interpolation, whic

10 answers

粉色の甜心 (original poster)

2020-11-28 02:27
Thanks for everyone's replies. Here is another attempt at summarizing them. Pardon if I say too many "obvious" things: I knew nothing about least squares before, so everything was new to me.

NOT polynomial interpolation

Polynomial interpolation is fitting a polynomial of degree n given n+1 data points, e.g. finding a cubic that passes exactly through four given points. As said in the question, this was not what I wanted: I had a lot of points and wanted a small-degree polynomial (which will only approximately fit, unless we've been lucky). But since some of the answers insisted on talking about it, I should mention them :) Lagrange polynomial, Vandermonde matrix, etc.
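For contrast with what follows, here is a minimal sketch of interpolation with a Vandermonde matrix. The four data points are made up for illustration; the only library calls are NumPy's `vander` and `linalg.solve`.

```python
import numpy as np

# Four hypothetical points; a cubic has four coefficients, so the fit is exact.
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([1.0, 2.0, 0.0, 5.0])

# Vandermonde matrix: row i is [1, x_i, x_i^2, x_i^3].
V = np.vander(xs, N=4, increasing=True)

# Square system V @ coeffs = ys; it has a unique solution when the x values are distinct.
coeffs = np.linalg.solve(V, ys)

# The interpolating cubic reproduces every data point (up to rounding).
max_error = np.max(np.abs(V @ coeffs - ys))
print(coeffs, max_error)
```

With more points than coefficients the system is no longer square, and that is where least squares comes in.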

What is least-squares?

"Least squares" is a particular definition/criterion/"metric" of "how well" a polynomial fits. (There are others, but this is simplest.) Say you are trying to fit a polynomial p(x,y) = a + bx + cy + dx² + ey² + fxy to some given data points (x_i,y_i,Z_i) (where "Z_i" was "f(x_i,y_i)" in the question). With least-squares the problem is to find the "best" coefficients (a,b,c,d,e,f), such that what is minimized (kept "least") is the "sum of squared residuals", namely

S = ∑_i (a + bx_i + cy_i + dx_i² + ey_i² + fx_iy_i - Z_i)²
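To make the definition concrete, here is a small sketch that evaluates S for a given choice of coefficients. The function name `sum_squared_residuals` and the four sample points are my own, chosen only for illustration.

```python
import numpy as np

def sum_squared_residuals(coeffs, x, y, Z):
    """S for p(x,y) = a + bx + cy + dx^2 + ey^2 + fxy at the given data points."""
    a, b, c, d, e, f = coeffs
    predicted = a + b*x + c*y + d*x**2 + e*y**2 + f*x*y
    return np.sum((predicted - Z)**2)

# Hypothetical data lying exactly on Z = (x + y)^2 = x^2 + y^2 + 2xy.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 0.0, 2.0, 1.0])
Z = (x + y)**2

S_exact = sum_squared_residuals((0, 0, 0, 1, 1, 2), x, y, Z)  # the true coefficients
S_other = sum_squared_residuals((0, 0, 0, 1, 1, 0), x, y, Z)  # xy term dropped
print(S_exact, S_other)
```

The true coefficients give S = 0, and any other choice gives a larger S; least squares searches for the choice that makes S smallest.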

Theory

The important idea is that if you look at S as a function of (a,b,c,d,e,f), then S is minimized at a point at which its gradient is 0. This means that for example ∂S/∂f=0, i.e. that

∑_i 2(a + … + fx_iy_i - Z_i) x_iy_i = 0

and similar equations for a, b, c, d, e. Note that these are just linear equations in a…f. So we can solve them with Gaussian elimination or any of the usual methods.
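Collected for all six coefficients, these gradient equations are usually written in matrix form as AᵀA c = AᵀZ (the "normal equations"), where row i of A holds (1, x_i, y_i, x_i², y_i², x_iy_i). A minimal sketch, assuming seeded random sample points of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n = 20
x = rng.uniform(-1.0, 1.0, n)
y = rng.uniform(-1.0, 1.0, n)
Z = (x + y)**2                  # noise-free data from x^2 + y^2 + 2xy

# Column order matches the coefficients (a, b, c, d, e, f).
A = np.column_stack([np.ones(n), x, y, x**2, y**2, x*y])

# Setting the gradient of S to zero gives (A^T A) c = A^T Z: six linear equations.
coeffs = np.linalg.solve(A.T @ A, A.T @ Z)

# Check: the gradient of S, 2 A^T (A c - Z), vanishes at the solution.
gradient = 2 * A.T @ (A @ coeffs - Z)
print(np.round(coeffs, 6))
```

Solving the normal equations directly is fine for a small, well-scaled problem like this one. Library routines such as `lstsq` avoid forming AᵀA explicitly, which tends to behave better numerically.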

This is still called "linear least squares", because although the function we wanted was a quadratic polynomial, it is still linear in the parameters (a,b,c,d,e,f). Note that the same thing works when we want p(x,y) to be any "linear combination" of arbitrary functions f_j, instead of just a polynomial (= "linear combination of monomials").
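As an illustration of the "arbitrary functions" point, the sketch below fits a combination of 1, sin(x) and exp(-x). The basis functions and the coefficients used to generate the data are arbitrary choices of mine, not anything from the question.

```python
import numpy as np

x = np.linspace(0.0, 5.0, 30)

# Hypothetical basis functions f_j; none of them is a monomial.
basis = [np.ones_like, np.sin, lambda t: np.exp(-t)]

# A[i][j] = f_j(x_i)
A = np.column_stack([f(x) for f in basis])

# Generate data from known coefficients so the fit can be checked.
true_coeffs = np.array([2.0, 3.0, -1.5])
Z = A @ true_coeffs

coeffs, _, _, _ = np.linalg.lstsq(A, Z, rcond=None)
print(coeffs)
```

The fitted function is not a polynomial at all, yet the problem is still linear least squares because the unknowns enter linearly.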

Code

For the univariate case (when there is only one variable x, so the f_j are the monomials x^j), there is NumPy's polyfit:
```
>>> import numpy
>>> xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> ys = [1.1, 3.9, 11.2, 21.5, 34.8, 51, 70.2, 92.3, 117.4, 145.5]
>>> # polyfit returns the coefficients (highest power first); poly1d wraps them for printing
>>> p = numpy.poly1d(numpy.polyfit(xs, ys, deg=2))
>>> print(p)
       2
1.517 x + 2.483 x + 0.4927
```
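To connect this with the theory, the same univariate fit can be reproduced with a general least-squares solver applied to the matrix of monomials. This is only a sketch of the equivalence, not a claim about how polyfit is implemented internally.

```python
import numpy as np

xs = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)
ys = np.array([1.1, 3.9, 11.2, 21.5, 34.8, 51, 70.2, 92.3, 117.4, 145.5])

# Columns are x^2, x, 1 (highest power first, the same ordering polyfit returns).
A = np.vander(xs, N=3)

coeffs_lstsq, _, _, _ = np.linalg.lstsq(A, ys, rcond=None)
coeffs_polyfit = np.polyfit(xs, ys, deg=2)

print(coeffs_lstsq)
print(coeffs_polyfit)
```

Both calls minimize the same sum of squared residuals, so the two coefficient vectors should agree up to rounding.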
For the multivariate case, or linear least squares in general, there is SciPy's linalg.lstsq. As explained in its documentation, it takes a matrix A of the values f_j(x_i). (In theory, the solution it returns is the Moore-Penrose pseudoinverse of A applied to the data.) With our above example involving (x_i,y_i,Z_i), fitting a polynomial means the f_j are the monomials x^p y^q with p+q at most the degree. The following finds the best quadratic (or best polynomial of any other degree, if you change the "degree = 2" line):
```
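from scipy import linalg
import random

# Sample n random points and evaluate (x+y)^2 plus a little noise
n = 20
x = [100*random.random() for i in range(n)]
y = [100*random.random() for i in range(n)]
Z = [(x[i]+y[i])**2 + 0.01*random.random() for i in range(n)]

degree = 2
# Row i of A holds every monomial x^xd * y^yd with xd+yd <= degree, evaluated at (x[i], y[i])
A = []
for i in range(n):
    A.append([])
    for xd in range(degree+1):
        for yd in range(degree+1-xd):
            A[i].append((x[i]**xd)*(y[i]**yd)) #f_j(x_i)

# Least-squares solution; c[j] is the coefficient of the j-th monomial, in the order built above
c,_,_,_ = linalg.lstsq(A,Z)
j = 0
for xd in range(0,degree+1):
    for yd in range(0,degree+1-xd):
        print(" + (%.2f)x^%dy^%d" % (c[j], xd, yd), end=" ")
        j += 1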
```
prints
```
 + (0.01)x^0y^0  + (-0.00)x^0y^1  + (1.00)x^0y^2  + (-0.00)x^1y^0  + (2.00)x^1y^1  + (1.00)x^2y^0
```
so it has discovered that the polynomial is x²+2xy+y²+0.01. [The constant term is sometimes -0.01 and sometimes 0, which is to be expected because of the random noise we added.]

Alternatives to Python+Numpy/Scipy are R and Computer Algebra Systems: Sage, Mathematica, Matlab, Maple. Even Excel might be able to do it. Numerical Recipes discusses methods to implement it ourselves (in C, Fortran).

Concerns
- It is strongly influenced by how the points are chosen. When I had x=y=range(20) instead of the random points, it always produced 1.33x²+1.33xy+1.33y², which was puzzling... until I realised that because I always had x[i]=y[i], the two polynomials agree on every data point: x²+2xy+y² = 4x² = (4/3)(x²+xy+y²). Among all the coefficient sets that fit equally well, lstsq returns the one with the smallest norm, which splits the 4 evenly across the three terms. So the moral is that it is important to choose the points carefully to get the "right" polynomial. (If you can choose, you should choose Chebyshev nodes for polynomial interpolation; not sure if the same is true for least squares as well.)
- Overfitting: a higher-degree polynomial can always fit the data at least as well as a lower-degree one. If you change the degree to 3 or 4 or 5, it still mostly recognizes the same quadratic polynomial (coefficients are 0 for higher-degree terms), but for larger degrees it starts fitting higher-degree polynomials. Even with degree 6, though, taking larger n (more data points, say 200 instead of 20) still recovers the quadratic polynomial. So the moral is to avoid overfitting, for which it might help to take as many data points as possible.
- There might be issues of numerical stability I don't fully understand.
- If you don't need a polynomial, you can obtain better fits with other kinds of functions, e.g. splines (piecewise polynomials).
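The first concern can be reproduced and detected in code. The sketch below uses NumPy's `linalg.lstsq` (rather than SciPy's) because it returns the effective rank of A alongside the solution; the explicit `rcond=1e-10` cutoff is an assumption of mine to make the rank decision clear-cut.

```python
import numpy as np

# Degenerate sampling: every point has x == y.
t = np.arange(20, dtype=float)
x, y = t, t
Z = (x + y)**2

degree = 2
# Same monomial order as the code above: (xd, yd) with xd + yd <= degree.
exponents = [(xd, yd) for xd in range(degree + 1) for yd in range(degree + 1 - xd)]
A = np.array([[xi**xd * yi**yd for xd, yd in exponents] for xi, yi in zip(x, y)])

# rank < number of columns means the data cannot tell some monomials apart.
c, _, rank, _ = np.linalg.lstsq(A, Z, rcond=1e-10)
print("rank", rank, "of", A.shape[1], "columns")
for (xd, yd), cj in zip(exponents, c):
    print("x^%d y^%d: %.2f" % (xd, yd, cj))
```

Here A has six columns, but x², xy and y² are identical on these points (and so are x and y), so the rank comes out lower than six. Checking the rank before trusting the coefficients is a cheap way to catch badly chosen points.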