Conditionally colour data points outside of confidence bands in R

梦毁少年i 2021-02-03 14:13

I need to colour data points that are outside of the confidence bands on the plot below differently from those within the bands. Should I add a separate column to my dataset

3 answers
  •  既然无缘
    2021-02-03 14:34

    Well, I thought that this would be pretty easy with ggplot2, but now I realize that I have no idea how the confidence limits for stat_smooth/geom_smooth are calculated.

    Consider the following:

    library(ggplot2)

    # severity.lm, diseasesev and temperature are assumed to be defined already
    pred <- as.data.frame(predict(severity.lm, level = 0.95, interval = "confidence"))
    # Flag each point as inside (TRUE) or outside (FALSE) the confidence band
    dat <- data.frame(diseasesev, temperature,
        in_interval = diseasesev <= pred$upr & diseasesev >= pred$lwr, pred)
    ggplot(dat, aes(y = diseasesev, x = temperature)) +
        stat_smooth(method = 'lm') +
        geom_point(aes(colour = in_interval)) +
        geom_line(aes(y = lwr), colour = 'red') +
        geom_line(aes(y = upr), colour = 'red')
    

    This produces the plot at http://ifellows.ucsd.edu/pmwiki/uploads/Main/strangeplot.jpg

    I don't understand why the confidence band calculated by stat_smooth is inconsistent with the band calculated directly from predict (i.e. the red lines). Can anyone shed some light on this?

    Edit:

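    Figured it out. ggplot2 (at least in the version used here) uses 1.96 * standard error to draw the intervals for all smoothing methods, whereas predict() on an lm fit uses a t quantile, which gives a wider band for small samples.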

    # se.fit = TRUE adds the standard errors; the columns become fit.fit, fit.lwr, fit.upr, se.fit, ...
    pred <- as.data.frame(predict(severity.lm, se.fit = TRUE,
            level = 0.95, interval = "confidence"))
    dat <- data.frame(diseasesev, temperature,
        in_interval = diseasesev <= pred$fit.upr & diseasesev >= pred$fit.lwr, pred)
    ggplot(dat, aes(y = diseasesev, x = temperature)) +
        stat_smooth(method = 'lm') +
        geom_point(aes(colour = in_interval)) +
        # Red: band from predict(); green: fit +/- 1.96 * standard error
        geom_line(aes(y = fit.lwr), colour = 'red') +
        geom_line(aes(y = fit.upr), colour = 'red') +
        geom_line(aes(y = fit.fit - 1.96 * se.fit), colour = 'green') +
        geom_line(aes(y = fit.fit + 1.96 * se.fit), colour = 'green')
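
    To check the two multipliers without the question's data, here is a minimal base R sketch. It assumes the built-in cars dataset as a stand-in (it is not the data from the question), and the names fit, centre, upr_t and upr_z are only illustrative. It rebuilds the upper limit from predict() by hand with a t quantile, compares it with the 1.96-style limit, and builds the same kind of logical column used above for colouring.

```r
# Illustrative sketch on the built-in `cars` data (an assumption; not the question's data)
fit <- lm(dist ~ speed, data = cars)
pred <- predict(fit, se.fit = TRUE, level = 0.95, interval = "confidence")

t_crit <- qt(0.975, df = pred$df)  # multiplier predict() uses for an lm fit
z_crit <- qnorm(0.975)             # normal multiplier, roughly 1.96

centre <- unname(pred$fit[, "fit"])
se <- unname(pred$se.fit)

# Rebuilding the upper limit by hand with the t multiplier matches predict()
upr_t <- centre + t_crit * se
print(all.equal(upr_t, unname(pred$fit[, "upr"])))

# The 1.96-based limit is narrower, and the gap grows as the sample shrinks
upr_z <- centre + z_crit * se
print(c(t = t_crit, z = z_crit))

# Logical column for colouring: TRUE when the point lies inside predict()'s band
in_interval <- cars$dist <= pred$fit[, "upr"] & cars$dist >= pred$fit[, "lwr"]
print(table(in_interval))
```

    If the points should be coloured according to the band that is actually drawn, it is probably safest to compute in_interval from the same limits that are plotted, rather than mixing the two multipliers.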
    
