How to plot a contour line showing where 95% of values fall, in R and in ggplot2

长情又很酷 2020-12-05 15:21

Say we have:

x <- rnorm(1000)
y <- rnorm(1000)

How do I use ggplot2 to produce a plot containing the two following geoms:

5 answers
  •  隐瞒了意图╮
    2020-12-05 16:02

    I had an example where the MASS::kde2d() bandwidth specifications were not flexible enough, so I switched to the ks package: ks::kde() for the density estimate and, as one option, ks::Hscv() to estimate a bandwidth matrix that captured the smoothness better. This computation can be a bit slow, but it gives much better estimates in some situations. Here is a version of the above code for that example:

    set.seed(1001)
    d <- data.frame(x=rnorm(1000),y=rnorm(1000))
    getLevel <- function(x,y,prob=0.95) {
        kk <- MASS::kde2d(x,y)
        dx <- diff(kk$x[1:2])
        dy <- diff(kk$y[1:2])
        sz <- sort(kk$z)
        c1 <- cumsum(sz) * dx * dy
        approx(c1, sz, xout = 1 - prob)$y
    }
    L95 <- getLevel(d$x,d$y)
    library(ggplot2); theme_set(theme_bw())
    ggplot(d,aes(x,y)) +
        stat_density2d(geom="tile", aes(fill = after_stat(density)),
                       contour = FALSE)+
        stat_density2d(colour="red",breaks=L95)
    
    ## using ks::kde with a smoothed cross-validation (SCV) bandwidth matrix
    hscv1 <- ks::Hscv(d)
    fhat <- ks::kde(d, H=hscv1, compute.cont=TRUE)
    
    dimnames(fhat[['estimate']]) <- list(fhat[["eval.points"]][[1]], 
                                         fhat[["eval.points"]][[2]])
    library(reshape2)
    aa <- melt(fhat[['estimate']])
    
    ggplot(aa, aes(x=Var1, y=Var2)) +
        geom_tile(aes(fill=value)) +
        geom_contour(aes(z=value), breaks=fhat[["cont"]]["50%"], color="red") +
        geom_contour(aes(z=value), breaks=fhat[["cont"]]["5%"], color="purple") 
    

    For this particular example, the differences are minimal, but where the bandwidth specification needs more flexibility, this modification may be important. Note that the 95% contour is specified with breaks=fhat[["cont"]]["5%"], which I found a little counter-intuitive because it is labeled the "5% contour": the label gives the share of probability mass left outside the contour, not the share inside it.
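
    As a quick sanity check on the MASS::kde2d() approach used in getLevel(), the sketch below recomputes the 95% level and then measures how much estimated mass, and what share of the sample, lies at or above it. The grid size n = 100, the nearest-grid-cell lookup, and the names mass_inside and frac_inside are assumptions made for illustration, not part of the original answer. Both numbers should come out close to 0.95, though neither will match it exactly.

```r
library(MASS)

set.seed(1001)
d <- data.frame(x = rnorm(1000), y = rnorm(1000))

# Same steps as getLevel(): find the density height below which
# 5% of the estimated probability mass lies.
kk <- MASS::kde2d(d$x, d$y, n = 100)   # n = 100 is an assumed grid size
dx <- diff(kk$x[1:2])
dy <- diff(kk$y[1:2])
sz <- sort(kk$z)
c1 <- cumsum(sz) * dx * dy
L95 <- approx(c1, sz, xout = 1 - 0.95)$y

# Check 1: estimated mass of all grid cells at or above the level.
mass_inside <- sum(kk$z[kk$z >= L95]) * dx * dy

# Check 2: share of sample points whose nearest grid cell is at or
# above the level (a rough lookup, not an interpolation).
n  <- length(kk$x)
ix <- pmin(pmax(round((d$x - kk$x[1]) / dx) + 1, 1), n)
iy <- pmin(pmax(round((d$y - kk$y[1]) / dy) + 1, 1), n)
frac_inside <- mean(kk$z[cbind(ix, iy)] >= L95)

print(c(level = L95, mass_inside = mass_inside, frac_inside = frac_inside))
```

    If either value is far from 0.95, the usual suspects are a grid that is too coarse or a bandwidth that oversmooths the tails.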
