R + ggplot : Time series with events

梦如初夏 2020-12-22 15:26

I'm an R/ggplot newbie. I would like to create a geom_line plot of a continuous variable time series and then add a layer composed of events. The continuous variable and it

3 answers
  •  既然无缘
    2020-12-22 15:41

    As much as I like @JD Long's answer, I'll put one that is just in R/ggplot2.

    The approach is to create a second data set of events and to use that to determine positions. Starting with what @Angelo had:

    library(ggplot2)
    data(presidential)
    data(economics)
    

    Pull out the event (presidential) data and transform it. Compute a baseline and an offset step (delta) from the economic data the events will be plotted with, and set the bottom of each bar (ymin) to the baseline.

    The tricky part is staggering labels that are too close together. Determine the spacing between adjacent events (this assumes the events are sorted by date). If the gap to the next event is less than some threshold (I picked about 4 years for this scale of data), flag that label as needing to be higher. It also has to be higher than the one after it, so use rle to get the length of each run of TRUEs (that is, must be higher) and build an offset vector from that: each run of TRUEs counts down from its length plus one to 2, and the FALSEs just get an offset of 1. Use this to determine the top of the bars (ymax).

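    events <- presidential[-(1:3),]   # drop the first three terms (they start before the economics series)
    baseline <- min(economics$unemploy)
    delta <- 0.05 * diff(range(economics$unemploy))   # one offset step = 5% of the y range
    events$ymin <- baseline
    # days until the next event; the last event has no successor
    events$timelapse <- c(as.numeric(diff(events$start)), Inf)
    events$bump <- events$timelapse < 4*370 # ~4 years
    offsets <- rle(events$bump)
    # runs of TRUE count down from length+1 to 2; FALSE gets 1.
    # SIMPLIFY=FALSE keeps mapply from returning a matrix when all runs have equal length.
    events$offset <- unlist(mapply(function(l,v) {if(v){(l:1)+1}else{rep(1,l)}},
        l=offsets$lengths, v=offsets$values, SIMPLIFY=FALSE, USE.NAMES=FALSE))
    events$ymax <- events$ymin + events$offset * delta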
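
    To make the countdown concrete, here is a minimal sketch of the same rle step on a made-up bump vector. The helper name stagger_offsets and the toy input are illustrative assumptions, not part of the original answer.

    ```r
    # Hypothetical helper: turn a logical "too close to the next label" vector
    # into stacking offsets, using the same rle countdown as above.
    stagger_offsets <- function(bump) {
      runs <- rle(bump)
      unlist(mapply(
        function(l, v) if (v) (l:1) + 1 else rep(1, l),
        l = runs$lengths, v = runs$values,
        SIMPLIFY = FALSE, USE.NAMES = FALSE
      ))
    }

    # Toy input (made up): events 2 and 3 form one crowded run, event 5 another.
    bump <- c(FALSE, TRUE, TRUE, FALSE, TRUE, FALSE)
    offsets <- stagger_offsets(bump)
    print(offsets)
    # 1 3 2 1 2 1
    ```

    Each crowded label sits one step above the label that follows it, and the label that closes a crowded run drops back to the base offset of 1.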
    

    Putting this together into a plot:

    ggplot() +
        geom_line(mapping=aes(x=date, y=unemploy), data=economics , size=3, alpha=0.5) +
        geom_segment(data = events, mapping=aes(x=start, y=ymin, xend=start, yend=ymax)) +
        geom_point(data = events, mapping=aes(x=start,y=ymax), size=3) +
        geom_text(data = events, mapping=aes(x=start, y=ymax, label=name), hjust=-0.1, vjust=0.1, size=6) +
        scale_x_date("time") +  
        scale_y_continuous(name="unemployed [1000's]")
    

    You could facet, but it is tricky with different scales. Another approach is composing two graphs. There is some extra fiddling that has to be done to make sure the plots have the same x-range, to make the labels all fit in the lower plot, and to eliminate the x axis in the upper plot.

    xrange <- range(c(economics$date, events$start))
    
    p1 <- ggplot(data=economics, mapping=aes(x=date, y=unemploy)) +
        geom_line(size=3, alpha=0.5) +
        scale_x_date("", limits=xrange) +
        scale_y_continuous(name="unemployed [1000's]") +
        # theme()/element_blank() replace the removed opts()/theme_blank()
        theme(axis.text.x = element_blank(), axis.title.x = element_blank())
    
    ylims <- c(0, (max(events$offset)+1)*delta) + baseline
    p2 <- ggplot(data = events, mapping=aes(x=start)) +
        geom_segment(mapping=aes(y=ymin, xend=start, yend=ymax)) +
        geom_point(mapping=aes(y=ymax), size=3) +
        geom_text(mapping=aes(y=ymax, label=name), hjust=-0.1, vjust=0.1, size=6) +
        scale_x_date("time", limits=xrange) +
        scale_y_continuous("", breaks=NULL, limits=ylims)   # breaks=NA is no longer accepted
    
    # The original answer used align.plots() from the R-Forge build of ggExtra,
    # which does not appear to be available any more. patchwork is substituted
    # here as one current way to stack plots with aligned panels.
    #install.packages("patchwork")
    library(patchwork)
    
    p1 / p2 + plot_layout(heights=c(3,1))
    
