How to estimate the best fitting function to a scatter plot in R?

Submitted by ε祈祈猫儿з on 2019-12-03 08:47:13

Here is an example comparing five models. Because of the form of the first two models, we can use lm to get good starting values. (Models that use different transformations of y should not be compared with one another, so lm1 and lm2 serve only to supply starting values, not as comparison models.) We then run nls for each of the first two models. After these two we try polynomials of various degrees in x. Fortunately lm and nls use consistent AIC definitions (this is not necessarily true of other R model-fitting functions), so we can simply use lm for the polynomials. Finally we plot the data and the fits of the first two models.

The lower the AIC the better, so nls1 is best, followed by lm3.2, followed by nls2.

lm1 <- lm(1/y ~ x)
nls1 <- nls(y ~ 1/(a + b*x), start = setNames(coef(lm1), c("a", "b")))
AIC(nls1) # -2.390924

lm2 <- lm(1/y ~ log(x))
nls2 <- nls(y ~ 1/(a + b*log(x)), start = setNames(coef(lm2), c("a", "b")))
AIC(nls2) # -1.29101

lm3.1 <- lm(y ~ x) 
AIC(lm3.1) # 13.43161

lm3.2 <- lm(y ~ poly(x, 2))
AIC(lm3.2) # -1.525982

lm3.3 <- lm(y ~ poly(x, 3))
AIC(lm3.3) # 0.1498972

plot(y ~ x)

lines(fitted(nls1) ~ x, lty = 1) # solid line
lines(fitted(nls2) ~ x, lty = 2) # dashed line

ADDED: a few more models, which were subsequently fixed up, and the notation was changed. Also, to follow up on Ben Bolker's comment, AIC can be replaced everywhere above with AICc from the AICcmodavg package.
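To see all five fits side by side, the comparison can be collected into one table. This is only a sketch: it assumes the x and y vectors given later in this thread, and it computes AICc by hand from the usual small-sample formula AICc = AIC + 2k(k+1)/(n-k-1) instead of calling the AICcmodavg package. The helper name `aicc` is made up for this illustration.

```r
# Data as given elsewhere in this thread (assumed to be the question's data)
x <- c(0.108, 0.111, 0.113, 0.116, 0.118, 0.121, 0.123, 0.126, 0.128, 0.131, 0.133, 0.136)
y <- c(-6.908, -6.620, -5.681, -5.165, -4.690, -4.646, -3.979, -3.755, -3.564, -3.558, -3.272, -3.073)

# Same models as above; lm1 and lm2 only supply starting values for nls
lm1   <- lm(1/y ~ x)
nls1  <- nls(y ~ 1/(a + b*x), start = setNames(coef(lm1), c("a", "b")))
lm2   <- lm(1/y ~ log(x))
nls2  <- nls(y ~ 1/(a + b*log(x)), start = setNames(coef(lm2), c("a", "b")))
lm3.1 <- lm(y ~ x)
lm3.2 <- lm(y ~ poly(x, 2))
lm3.3 <- lm(y ~ poly(x, 3))

# AIC() accepts several fitted models and returns a data frame (columns df, AIC)
tab <- AIC(nls1, nls2, lm3.1, lm3.2, lm3.3)

# Hypothetical helper: small-sample corrected AIC computed by hand
aicc <- function(m) {
  k <- attr(logLik(m), "df")  # number of estimated parameters, incl. the error variance
  n <- nobs(m)                # number of observations used in the fit
  AIC(m) + 2 * k * (k + 1) / (n - k - 1)
}
tab$AICc <- sapply(list(nls1, nls2, lm3.1, lm3.2, lm3.3), aicc)

# Sort so the lowest (best) AIC comes first
tab <- tab[order(tab$AIC), ]
print(tab)
```

With only 12 observations the correction term grows quickly with the number of parameters, so the AICc ranking need not match the AIC ranking; it is worth looking at both columns.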

I would begin with an exploratory plot, something like this:

x<-c(0.108,0.111,0.113,0.116,0.118,0.121,0.123,0.126,0.128,0.131,0.133,0.136)
y<-c(-6.908,-6.620,-5.681,-5.165,-4.690,-4.646,-3.979,-3.755,-3.564,-3.558,-3.272,-3.073)
dat <- data.frame(y=y,x=x)
library(latticeExtra)
library(grid)
xyplot(y ~ x,data=dat,par.settings = ggplot2like(),
       panel = function(x,y,...){
         panel.xyplot(x,y,...)
       })+
  layer(panel.smoother(y ~ x, method = "lm"), style =1)+  ## linear
  layer(panel.smoother(y ~ poly(x, 3), method = "lm"), style = 2)+  ## cubic
  layer(panel.smoother(y ~ x, span = 0.9), style = 3) +  ## loess
  layer(panel.smoother(y ~ log(x), method = "lm"), style = 4)  ## log

It looks like you need a cubic model.

summary(lm(y~poly(x,3),data=dat))

Residual standard error: 0.1966 on 8 degrees of freedom
Multiple R-squared: 0.9831, Adjusted R-squared: 0.9767 
F-statistic: 154.8 on 3 and 8 DF,  p-value: 2.013e-07 
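One way to check whether each extra polynomial term earns its place is a sequential F-test on nested fits. The following is a sketch using the same data; the object names `fit1`, `fit2`, `fit3` and `cmp` are made up here, and the test is an extra check, not part of the summary output above.

```r
x <- c(0.108, 0.111, 0.113, 0.116, 0.118, 0.121, 0.123, 0.126, 0.128, 0.131, 0.133, 0.136)
y <- c(-6.908, -6.620, -5.681, -5.165, -4.690, -4.646, -3.979, -3.755, -3.564, -3.558, -3.272, -3.073)
dat <- data.frame(y = y, x = x)

# Nested polynomial fits of increasing degree
fit1 <- lm(y ~ poly(x, 1), data = dat)
fit2 <- lm(y ~ poly(x, 2), data = dat)
fit3 <- lm(y ~ poly(x, 3), data = dat)

# Each row tests the model against the previous, smaller one
cmp <- anova(fit1, fit2, fit3)
print(cmp)
```

A small p-value in a row suggests the added term improves the fit beyond what chance would explain; a large one suggests the simpler model may be enough.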

You could start by reading the classic paper by Box and Cox on transformations. They discuss how to compare transformations and how to find meaningful transformations within a set or family of potential transforms. The log transform and linear model are special cases of the Box-Cox family.

And as @agstudy said, always plot the data as well.
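As a rough illustration of the Box-Cox suggestion, boxcox() from the MASS package profiles the log-likelihood over the transformation parameter lambda. Box-Cox requires a strictly positive response, and y in this thread is negative, so the sketch below applies it to -y. That sign flip is an assumption made purely so the example runs; it is not something the answer proposes.

```r
library(MASS)

x <- c(0.108, 0.111, 0.113, 0.116, 0.118, 0.121, 0.123, 0.126, 0.128, 0.131, 0.133, 0.136)
y <- c(-6.908, -6.620, -5.681, -5.165, -4.690, -4.646, -3.979, -3.755, -3.564, -3.558, -3.272, -3.073)

# ASSUMPTION: flip the sign so the response is positive, as Box-Cox requires
pos_y <- -y

# y = TRUE keeps the response in the fit, which boxcox() needs
fit <- lm(pos_y ~ x, y = TRUE)

# Profile log-likelihood over a grid of lambda values, without plotting
bc <- boxcox(fit, lambda = seq(-2, 2, by = 0.1), plotit = FALSE)

# Grid value of lambda with the highest profile log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]
lambda_hat
```

In this family lambda = 1 corresponds to leaving the response untransformed and lambda = 0 to the log transform, so comparing the profile likelihood at those two values against the maximum gives a sense of whether either simple choice is reasonable.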
