ggplot2: geom_ribbon with alpha dependent on data density along y-axis for each x

只愿长相守 提交于 2019-12-01 17:34:52

One option is to use quantile regression to get the y-values for each quantile at each x-value and then plot those using geom_ribbon.

library(splines)
library(quantreg)
library(reshape2)
library(dplyr)
  1. Set quantiles for density ribbons:

    nq = 50 # Numbre of quantiles
    qq = seq(0,1, length.out=nq)
    
  2. Run the quantile regression for each quantile. I've used a flexible spline function to get a good fit to the sine function:

    m1 = rq(y ~ ns(x,10), data=sim.df, tau=qq)
    
  3. Create data frame for use by geom_ribbon to plot density quantiles.

    Create a data frame of regression quantile predictions using predict:

    xvals = seq(min(sim.df$x), max(sim.df$x), length.out=100)
    rqs = data.frame(x=xvals, predict(m1, newdata=data.frame(x=xvals)))
    names(rqs) = c("x", paste0("p",100*qq))
    

    Reshape the data so that the predictions for each quantile serve as the ymax for one quantile and the ymin for the next quantile in succession (with the exception that the first quantile serves only once as the first ymin and the last quantile serves only once as the last ymax). Put the data in long format so that we can group by quantile in ggplot:

    dat1 = rqs[, -length(rqs)]
    names(dat1)[-1] = paste0(names(dat1)[-1])
    dat2 = rqs[, -2]
    names(dat2)[-1] = paste0(names(dat1)[-1])
    
    dat1 = melt(dat1, id.var="x")
    names(dat1) = c("x","group","min")
    dat2 = melt(dat2, id.var="x")
    names(dat2) = c("x","group1","max")
    
    dat = bind_cols(dat1, dat2)
    
  4. Now create the plot. We map the quantiles to the alpha aesthetic, and then use scale_alpha_manual to set the alpha values to be higher for quantiles closer to 0.5 and lower for quantiles closer to 0 and 1:

    ggplot() +
      geom_point(data=sim.df, aes(x,y), alpha=0.1, size=0.5, colour="red") +
      geom_ribbon(data=dat, aes(x=x, ymin=min, ymax=max, group=group, alpha=group), 
              fill="blue", lwd=0, show.legend=FALSE) +
      theme_bw() +
      scale_alpha_manual(values=c(seq(0.05,0.9,length.out=floor(0.5*length(qq))),
                                  seq(0.9,0.05,length.out=floor(0.5*length(qq)))))
    

Here's another example, but with data that has a varying standard deviation:

sim <- lapply(1:num_samples, function(f) {
  x = runif(num_points, 0, 4*pi)
  y = sin(x) + rnorm(num_points, 0, abs(0.7*cos(x))+0.1)
  data.frame(x=x, y=y)
})

sim.df <- do.call(rbind, sim)

Now just run all of the code we created earlier to get this plot:

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!