Bar charts connected by lines / How to connect two graphs arranged with grid.arrange in R / ggplot2

清酒与你 2021-01-03 03:19

At Facebook Research, I found these beautiful bar charts, which are connected by lines to indicate rank changes:

https://research.fb.com/do-jobs-run-in-families/

2 answers
  •  情书的邮戳
    2021-01-03 03:44

    Here's a pure ggplot2 solution that combines the underlying data frames into one and plots everything in a single plot:

    Data manipulation:

    library(dplyr)
    library(ggplot2)
    bar.width <- 0.9
    
    # combine the two data sources
    df <- rbind(state1 %>% mutate(source = "state1"),
                state2 %>% mutate(source = "state2")) %>%
    
      # calculate each state's rank within each data source
      group_by(source, state) %>%
      mutate(state.sum = sum(value)) %>%
      ungroup() %>%
      group_by(source) %>%
      mutate(source.rank = as.integer(factor(state.sum))) %>%
      ungroup() %>%
    
      # calculate the dimensions for each bar
      group_by(source, state) %>%
      arrange(type) %>% 
      mutate(xmin = lag(cumsum(value), default = 0),
             xmax = cumsum(value),
             ymin = source.rank - bar.width / 2,
             ymax = source.rank + bar.width / 2) %>% 
      ungroup() %>%
    
      # shift each data source's coordinates away from point of origin,
      # in order to create space for plotting lines
      mutate(x = ifelse(source == "state1", -max(xmax) / 2, max(xmax) / 2)) %>%
      mutate(xmin = ifelse(source == "state1", x - xmin, x + xmin),
             xmax = ifelse(source == "state1", x - xmax, x + xmax)) %>%
    
      # calculate label position for each data source
      group_by(source) %>%
      mutate(label.x = max(abs(xmax))) %>%
      ungroup() %>%
      mutate(label.x = ifelse(source == "state1", -label.x, label.x),
             hjust = ifelse(source == "state1", 1.1, -0.1))
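
    To see what the `lag(cumsum(value))` step produces, here is a minimal sketch on a single made-up bar. The values 10, 20 and 30 and the names `toy`, `toy_bounds` and `origin` are illustrative assumptions, not part of the data used in this answer. Each segment starts where the previous one ended, and the left-hand source is then mirrored by subtracting those offsets from a negative origin:

    ```r
    library(dplyr)

    # Hypothetical one-state example; values are made up for illustration
    toy <- tibble(
      type  = c("fed", "local", "state"),
      value = c(10, 20, 30)
    )

    toy_bounds <- toy %>%
      arrange(type) %>%
      mutate(xmin = lag(cumsum(value), default = 0),  # where each segment starts
             xmax = cumsum(value))                    # where each segment ends

    toy_bounds$xmin  # 0 10 30
    toy_bounds$xmax  # 10 30 60

    # Mirror the bar to the left of a negative origin, as done for "state1"
    origin <- -max(toy_bounds$xmax) / 2   # -30
    left_xmin <- origin - toy_bounds$xmin # -30 -40 -60
    left_xmax <- origin - toy_bounds$xmax # -40 -60 -90
    ```

    The gap between `-max(xmax) / 2` and `max(xmax) / 2` is what leaves room for the connecting lines in the middle of the plot.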
    

    Plot:

    ggplot(df, 
           aes(x = x, y = source.rank,
               xmin = xmin, xmax = xmax, 
               ymin = ymin, ymax = ymax,
               fill = type)) +
      geom_rect() +
      geom_line(aes(group = state)) +
      geom_text(aes(x = label.x, label = state, hjust = hjust),
                check_overlap = TRUE) +
    
      # allow some space for the labels; this may be changed
      # depending on plot dimensions
      scale_x_continuous(expand = c(0.2, 0)) +
      # `fill` must already exist: a vector of colours, one per level of `type`
      scale_fill_manual(values = fill) +
    
      theme_void() +
      theme(legend.position = "top")
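
    `scale_fill_manual(values = fill)` relies on a `fill` vector that is not defined in this answer. A minimal sketch of one, assuming three arbitrary colours (the hex codes are placeholders chosen purely for illustration, not the colours of the original chart):

    ```r
    # Hypothetical palette: one colour per level of `type`.
    # Naming the elements ties each colour to a specific type,
    # regardless of the order in which the types appear in the data.
    fill <- c(fed = "#1b9e77", local = "#d95f02", state = "#7570b3")
    ```

    Define it before running the plot code; any three valid colour strings should work.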
    

    Data source (same as @camille's); run this first, since the code above expects `state1` and `state2` to exist:

    set.seed(1017)
    
    state1 <- tibble(
      state = rep(state.name[1:5], each = 3),
      value = floor(runif(15, 1, 100)),
      type = rep(c("state", "local", "fed"), times = 5)
    )
    
    state2 <- tibble(
      state = rep(state.name[1:5], each = 3),
      value = floor(runif(15, 1, 100)),
      type = rep(c("state", "local", "fed"), times = 5)
    )
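
    The ranking step in the data manipulation relies on `as.integer(factor(x))`, which maps the sorted unique values of `x` to 1, 2, 3, and so on. A small sketch with made-up totals (hypothetical numbers, not the totals of the sample data) shows the behaviour, including what happens with ties:

    ```r
    library(dplyr)

    # Hypothetical per-state totals; two states are tied on purpose
    totals <- c(Alabama = 120, Alaska = 85, Arizona = 120, Arkansas = 40)

    # Same idiom as in the data manipulation: smallest total gets rank 1
    ranks <- as.integer(factor(totals))
    names(ranks) <- names(totals)
    ranks

    # Tied totals share a rank, so their bars would be drawn on top of each other.
    # One possible way to break ties (by order of appearance) is row_number():
    ranks_no_ties <- row_number(totals)
    ```

    With random sample data ties are unlikely, but with real totals it may be worth checking for them before plotting.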
    
