Confidence interval for LOWESS in Python

问题

How would I calculate the confidence intervals for a LOWESS regression in Python? I would like to add these as a shaded region to the LOESS plot created with the following code (other packages than statsmodels are fine as well).

import numpy as np
import pylab as plt
import statsmodels.api as sm

x = np.linspace(0,2*np.pi,100)
y = np.sin(x) + np.random.random(100) * 0.2
lowess = sm.nonparametric.lowess(y, x, frac=0.1)

plt.plot(x, y, '+')
plt.plot(lowess[:, 0], lowess[:, 1])
plt.show()

I've added an example plot with confidence interval below from the webblog Serious Stats (it is created using ggplot in R).

回答1:

LOESS doesn't have an explicit concept for standard error. It just doesn't mean anything in this context. Since that's out, your stuck with the brute-force approach.

Bootstrap your data. Your going to fit a LOESS curve to the bootstrapped data. See the middle of this page to find a pretty picture of what your doing. http://statweb.stanford.edu/~susan/courses/s208/node20.html

Once you have your large number of different LOESS curves, you can find the top and bottom Xth percentile.

回答2:

This is a very old question but it's one of the first that pops up on google search. You can do this using the loess() function from scikit-misc. Here's an example (I tried to keep your original variable names, but I bumped up the noise a bit to make it more visible)

import numpy as np
import pylab as plt
from skmisc.loess import loess

x = np.linspace(0,2*np.pi,100)
y = np.sin(x) + np.random.random(100) * 0.4

l = loess(x,y)
l.fit()
pred = l.predict(x, stderror=True)
conf = pred.confidence()

lowess = pred.values
ll = conf.lower
ul = conf.upper

plt.plot(x, y, '+')
plt.plot(x, lowess)
plt.fill_between(x,ll,ul,alpha=.33)
plt.show()

result:

来源：https://stackoverflow.com/questions/31104565/confidence-interval-for-lowess-in-python

标签

python

statsmodels

loess