R - ggplot2 'dodge' geom_step() to overlap geom_bar()

本小妞迷上赌 posted on 2020-05-26 01:08:48

Question


Plotting counts with ggplot2's geom_bar(stat="identity") is an effective way of visualising them. I would like to use this method to display my observed counts and compare them to expected counts, by using geom_step to overlay a stairstep layer on the barplot.

However, when I do this I run into a problem: each bar is centred on its x value, whereas geom_step starts each step at the x value, so the two layers do not line up. For example, using first a continuous and then a discrete x variable:

library(tidyverse)

test <- tibble(a = 1:10, b = runif(10, 1, 10))

test_plot <- ggplot(test, aes(a, b)) + 
  geom_bar(stat="identity") + 
  geom_step(color = 'red')

test2 <- tibble(a = letters[1:10], b = runif(10, 1, 10))

test2_plot <- ggplot(test2, aes(a, b, group = 1)) + 
  geom_bar(stat="identity") + 
  geom_step(color = 'red')

gridExtra::grid.arrange(test_plot, test2_plot, ncol = 2)

As you can see, the two layers are offset, which is undesirable.

Reading the docs, I see that geom_path has a position = option. However, trying something like geom_step(color = 'red', position = position_dodge(width = 0.5)) does not do what I want: it compresses the bars and the stairstep line towards the centre. Another option is to adjust the data directly, like this: geom_step(aes(a-0.5, b), color = 'red'), which produces a nearly acceptable result for data with a continuous x variable. You could also calculate the stairstep line as a function and plot it using stat_function().

However, these approaches are not applicable to data with a discrete x variable, and my actual data has a discrete x variable, so I need another answer.

Additionally, when shifted, the stairstep line does not cover the last bar, as seen in the above image. Is there an easy, elegant way to extend it to cover the last bar?

If geom_step() is the wrong approach and what I'm trying to get can be achieved in another way, I am interested in that too.


Answer 1:


I think the most efficient way to solve this problem is to define a custom geom in the following way:

library(tidyverse)

geom_step_extend <- function(data, extend = 1, nudge = -0.5,
                             ...) {
  # Function for computing the last segment data
  get_step_extend_data <- function(data, extend = 1, nudge = -0.5) {
    data_out <- as.data.frame(data[order(data[[1]]), ])
    n <- nrow(data)
    max_x_y <- data_out[n, 2]
    if (is.numeric(data_out[[1]])) {
      max_x <- data_out[n, 1] + nudge
    } else {
      max_x <- n + nudge
    }

    data.frame(x = max_x,
               y = max_x_y,
               xend = max_x + extend,
               yend = max_x_y)
  }

  # The resulting geom
  list(
    geom_step(position = position_nudge(x = nudge), ...),
    geom_segment(
      data = get_step_extend_data(data, extend = extend, nudge = nudge),
      mapping = aes(x = x, y = y,
                    xend = xend, yend = yend),
      ...
    )
  )
}

set.seed(111)
test <- tibble(a = 1:10, b = runif(10, 1, 10))
test2 <- tibble(a = letters[1:10], b = runif(10, 1, 10))

test_plot <- ggplot(test, aes(a, b, group = 1)) + 
  geom_bar(stat = "identity") + 
  geom_step_extend(data = test, colour = "red")

test2_plot <- ggplot(test2, aes(a, b, group = 1)) + 
  geom_bar(stat = "identity") + 
  geom_step_extend(data = test2, colour = "red")

gridExtra::grid.arrange(test_plot, test2_plot, ncol = 2)

Basically, this solution consists of three parts:

  1. Nudge the step curve to the left by the desired value (in this case -0.5) with position_nudge;
  2. Compute the data for the missing segment (the one on the right) with the function get_step_extend_data. Its behaviour is inspired by ggplot2:::stairstep, which is the underlying function of geom_step;
  3. Combine geom_step and geom_segment into a single geom by returning them in a list.



Answer 2:


Here's a rather crude solution, but it should work in this case.

Create an alternate data frame that expands each row into three points, at x - 0.5, x and x + 0.5, all with the same y value:

# Three points per bar: left edge, centre and right edge, all at the bar's height
test2 <- data.frame(a = rep(test$a, each = 3) + c(-0.5, 0, 0.5),
                    b = rep(test$b, each = 3))

Plot the outline with geom_line:

ggplot(test, aes(a,b)) + geom_bar(stat="identity", alpha=.7) + geom_line(data=test2, colour="red")

This will look tidier if you set the geom_bar width to 1:

ggplot(test, aes(a,b)) + geom_bar(width=1, stat="identity", alpha=.7) + geom_line(data=test2, colour="red")
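The same idea can be extended to a discrete x axis, which is what the question actually needs. The following is only a minimal sketch: bar_outline is a hypothetical helper (it is not part of ggplot2), and it assumes that each x value appears once and that discrete values sit at the integer positions 1, 2, ... in sorted level order, which is where ggplot2's default discrete scale places them.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): returns the top outline of bars
# centred on `x` with heights `y`, as two points (left and right edge) per bar.
bar_outline <- function(x, y, width = 1) {
  # Assumption: discrete values are drawn at integer positions 1, 2, ...
  # in sorted level order; numeric values are used as they are.
  pos <- if (is.numeric(x)) x else as.integer(factor(x))
  ord <- order(pos)
  pos <- pos[ord]
  y <- y[ord]
  data.frame(
    x = c(rbind(pos - width / 2, pos + width / 2)),  # left, right, left, right, ...
    y = rep(y, each = 2)
  )
}

set.seed(111)
obs <- data.frame(a = letters[1:10], b = runif(10, 1, 10))

# geom_path connects the points in row order, so the outline is drawn
# left to right across all bars, including the first and the last one.
ggplot(obs, aes(a, b)) +
  geom_col(width = 1, alpha = 0.7) +
  geom_path(data = bar_outline(obs$a, obs$b), aes(x, y),
            colour = "red", inherit.aes = FALSE)
```

Because the outline is computed from the bar edges rather than shifted, nothing needs to be nudged and the last bar is covered. If the bars keep a width other than 1, passing the same value as width keeps the horizontal tops aligned, although the line then slopes between neighbouring bars instead of dropping vertically.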




Answer 3:


Since ggplot2 version 3.3.0, this is supported directly by geom_step using direction = "mid":

library(tidyverse)

test <- tibble(a = 1:10, b = runif(10, 1, 10))

test_plot <- ggplot(test, aes(a, b)) + 
  geom_bar(stat="identity") + 
  geom_step(color = 'red', direction = "mid", size = 2)

test_plot
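One caveat, as far as I can tell from how the "mid" steps are built: the line starts at the first x value and ends at the last one, so it covers only the inner half of the first and last bars. A possible workaround, sketched here for a numeric x only, is to duplicate the first and last rows half a bar width outside the data. pad_ends is a hypothetical helper, not a ggplot2 function, and the column name "a" is taken from the example data.

```r
library(ggplot2)

# Hypothetical helper: duplicate the first and last rows half a bar width
# outside the data so that a "mid" step line spans the outer bars fully.
# Assumes the x column is numeric.
pad_ends <- function(df, x = "a", half_width = 0.5) {
  df <- as.data.frame(df[order(df[[x]]), ])
  first <- df[1, , drop = FALSE]
  last <- df[nrow(df), , drop = FALSE]
  first[[x]] <- first[[x]] - half_width  # left edge of the first bar
  last[[x]] <- last[[x]] + half_width    # right edge of the last bar
  rbind(first, df, last)
}

set.seed(111)
test <- data.frame(a = 1:10, b = runif(10, 1, 10))

ggplot(test, aes(a, b)) +
  geom_col() +
  geom_step(data = pad_ends(test), colour = "red", direction = "mid")
```

The padded rows repeat the neighbouring heights, so no extra vertical step appears at either end. The steps between bars still fall at the midpoints between x values.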



Source: https://stackoverflow.com/questions/43434725/r-ggplot2-dodge-geom-step-to-overlap-geom-bar
