Conditionally colour data points outside of confidence bands in R

梦毁少年i 2021-02-03 14:13

I need to colour the data points that fall outside the confidence bands on the plot below differently from those within the bands. Should I add a separate column to my dataset?

3 answers

既然无缘 (original poster)

2021-02-03 14:34

Well, I thought that this would be pretty easy with ggplot2, but now I realize that I have no idea how the confidence limits for stat_smooth/geom_smooth are calculated.

Consider the following:

library(ggplot2)
# severity.lm is assumed to be the fitted model lm(diseasesev ~ temperature)
pred <- as.data.frame(predict(severity.lm, level = 0.95, interval = "confidence"))
# in_interval is TRUE for points that lie within the band from predict()
dat <- data.frame(diseasesev, temperature,
    in_interval = diseasesev <= pred$upr & diseasesev >= pred$lwr, pred)
ggplot(dat, aes(y = diseasesev, x = temperature)) +
    stat_smooth(method = 'lm') +
    geom_point(aes(colour = in_interval)) +
    geom_line(aes(y = lwr), colour = 'red') +
    geom_line(aes(y = upr), colour = 'red')

This produces the following plot: http://ifellows.ucsd.edu/pmwiki/uploads/Main/strangeplot.jpg

I don't understand why the confidence band calculated by stat_smooth is inconsistent with the band calculated directly from predict (i.e. the red lines). Can anyone shed some light on this?

Edit:

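Figured it out. ggplot2 uses 1.96 * standard error to draw the intervals for all smoothing methods, whereas predict uses a t quantile on the residual degrees of freedom, so its band comes out slightly wider for a small sample. The green lines below reproduce the stat_smooth band: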

# se.fit = TRUE adds the standard errors; the columns become fit.fit, fit.lwr, fit.upr, se.fit
pred <- as.data.frame(predict(severity.lm, se.fit = TRUE,
        level = 0.95, interval = "confidence"))
dat <- data.frame(diseasesev, temperature,
    in_interval = diseasesev <= pred$fit.upr & diseasesev >= pred$fit.lwr, pred)
ggplot(dat, aes(y = diseasesev, x = temperature)) +
    stat_smooth(method = 'lm') +
    geom_point(aes(colour = in_interval)) +
    # red: band from predict()
    geom_line(aes(y = fit.lwr), colour = 'red') +
    geom_line(aes(y = fit.upr), colour = 'red') +
    # green: fit +/- 1.96 * standard error
    geom_line(aes(y = fit.fit - 1.96 * se.fit), colour = 'green') +
    geom_line(aes(y = fit.fit + 1.96 * se.fit), colour = 'green')
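
For anyone who wants to check the numbers without the original data, here is a minimal base R sketch of the same idea. The temperature and diseasesev values are made up for illustration (they are not the data from the question), and the names t_crit, half_width_t and half_width_z are just helpers I introduced. It shows where the two bands differ and how the in_interval column can be built.

```r
# Hypothetical data standing in for the question's vectors (not the real data).
set.seed(42)
temperature <- seq(2, 30, length.out = 15)
diseasesev  <- 2 + 0.3 * temperature + rnorm(15, sd = 1.5)
severity.lm <- lm(diseasesev ~ temperature)

# With se.fit = TRUE, predict() returns a list; as.data.frame() flattens it
# into the columns fit.fit, fit.lwr, fit.upr, se.fit, df and residual.scale.
pred <- as.data.frame(predict(severity.lm, se.fit = TRUE,
                              level = 0.95, interval = "confidence"))

# predict() builds its band from a t quantile on the residual degrees of freedom.
t_crit <- qt(0.975, df = severity.lm$df.residual)
half_width_t <- t_crit * pred$se.fit   # half-width of the predict() band
half_width_z <- 1.96 * pred$se.fit     # half-width of the 1.96 * se band

# Extra column flagging the points that fall inside the predict() band.
dat <- data.frame(temperature, diseasesev, pred)
dat$in_interval <- dat$diseasesev >= dat$fit.lwr & dat$diseasesev <= dat$fit.upr

print(c(t = t_crit, z = 1.96))
print(table(dat$in_interval))
```

The resulting dat can be handed to the ggplot call above, with geom_point(aes(colour = in_interval)) doing the conditional colouring. Keep in mind that a confidence band describes the fitted mean, so it is normal for many individual points to land outside it.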
