Plotting a bivariate to multiple factors in R

终归单人心 2021-01-06 12:14

First of all, I'm still a beginner. I'm trying to interpret and draw a stacked bar plot with R. I already took a look at a number of answers but some were not specific to my

3 answers
  •  无人及你
    2021-01-06 13:10

    Here is one possibility: start with the 'un-tabulated' data frame, melt it into long format, plot it with geom_bar in ggplot2 (which does the counting per group), and split the plot by variable using facet_wrap.

    Create toy data:

    set.seed(123)
    df <- data.frame(Variant = sample(c("iedere", "elke"), size = 50, replace = TRUE),
               Region = sample(c("VL", "NL"), size = 50, replace = TRUE),
               PrecededByPrep = sample(c("1", "0"), size = 50, replace = TRUE),
               Person = sample(c("person", "no person"), size = 50, replace = TRUE),
               Time = sample(c("time", "no time"), size = 50, replace = TRUE))
    

    Reshape data:

    library(reshape2)
    df2 <- melt(df, id.vars = "Variant")
    
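    If reshape2 is not available, the same long format can be produced with tidyr. This is an alternative sketch, not the approach used in this answer; the miniature data frame `wide` is made up for illustration. Note that pivot_longer returns `variable` as character, so it is converted to a factor here to keep the facets in column order, as melt does.

    ```r
    library(tidyr)

    # Hypothetical miniature of the wide data: one row per observation
    wide <- data.frame(
      Variant = c("iedere", "elke", "elke", "iedere"),
      Region  = c("VL", "NL", "VL", "VL"),
      Person  = c("person", "no person", "person", "person"),
      stringsAsFactors = FALSE
    )

    # Every column except Variant becomes a (variable, value) pair
    long <- pivot_longer(wide, cols = -Variant,
                         names_to = "variable", values_to = "value")

    # Keep the original column order for the facets
    long$variable <- factor(long$variable, levels = names(wide)[-1])
    ```

    Each input row yields one output row per measured column, so 4 rows and 2 columns give 8 rows in `long`.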

    Plot:

    library(ggplot2)
    ggplot(data = df2, aes(factor(value), fill = Variant)) +
      geom_bar() +
      facet_wrap(~variable, nrow = 1, scales = "free_x") +
      scale_fill_grey(start = 0.5) +
      theme_bw()
    


    There are lots of opportunities to customize the plot, such as setting order of factor levels, rotating axis labels, wrapping facet labels on two lines (e.g. for the longer variable name "PrecededByPrep"), or changing spacing between facets.
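
    As a minimal sketch of the first of these, the order of the bars on the x axis follows the factor levels of `value`, so it can be set explicitly before plotting. The small data frame `long` below is a made-up stand-in for the melted data; only base R is used.

    ```r
    # Hypothetical stand-in for the melted data frame
    long <- data.frame(
      variable = rep(c("Region", "Time"), each = 3),
      value    = c("VL", "NL", "VL", "time", "no time", "time"),
      stringsAsFactors = FALSE
    )

    # Without explicit levels, factor() sorts the values alphabetically.
    # Listing the levels fixes the left-to-right order of the bars.
    long$value <- factor(long$value,
                         levels = c("VL", "NL", "time", "no time"))

    levels(long$value)
    ```

    Calling factor() again on an existing factor, as in aes(factor(value)), keeps this level order, and with scales = "free_x" each facet shows only the levels that occur in it.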

    Customization (following updates in question and comments by OP)

    # labeller used in facet_grid to wrap "PrecededByPrep" on two lines
    # see http://www.cookbook-r.com/Graphs/Facets_%28ggplot2%29/#modifying-facet-label-text
    # labeller functions with (var, value) arguments are deprecated since
    # ggplot2 2.0.0; as_labeller() wraps a function that maps the labels directly
    my_lab <- as_labeller(function(value) {
      ifelse(value == "PrecededByPrep", "Preceded\nByPrep", value)
    })
    
    ggplot(data = df2, aes(factor(value), fill = Variant)) +
      geom_bar() +
      facet_grid(~variable, scales = "free_x", labeller = my_lab) + 
      scale_fill_manual(values = c("paleturquoise3", "palegreen3")) + # manual fill colors
      theme_bw() +
      theme(axis.text = element_text(face = "bold"), # axis tick labels bold 
            axis.text.x = element_text(angle = 45, hjust = 1), # rotate x axis labels
            line = element_line(colour = "gray25"), # line colour gray25 = #404040
            strip.text = element_text(face = "bold")) + # facet labels bold  
      xlab("factors") + # set axis labels
      ylab("frequency")
    


    Add counts to each bar (edit following comments from OP).

    The basic principles to calculate the y coordinates can be found in this Q&A. Here I use dplyr to calculate counts per bar (i.e. label in geom_text) and their y coordinates, but this could of course be done in base R, plyr or data.table.

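    # calculate counts (i.e. labels for geom_text) and their y positions.
    # ggplot2 >= 2.2.0 stacks groups in reverse order (first level on top),
    # so accumulate the counts starting from the last Variant level
    library(dplyr)
    df3 <- df2 %>%
      group_by(variable, value, Variant) %>%
      summarise(n = n(), .groups = "drop_last") %>%
      arrange(desc(Variant), .by_group = TRUE) %>%
      mutate(y = cumsum(n) - (0.5 * n))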
    
    # plot
    ggplot(data = df2, aes(x = factor(value), fill = Variant)) +
      geom_bar() +
      geom_text(data = df3, aes(y = y, label = n)) +
      facet_grid(~variable, scales = "free_x", labeller = my_lab) + 
      scale_fill_manual(values = c("paleturquoise3", "palegreen3")) + # manual fill colors
      theme_bw() +
      theme(axis.text = element_text(face = "bold"), # axis tick labels bold 
            axis.text.x = element_text(angle = 45, hjust = 1), # rotate x axis labels
            line = element_line(colour = "gray25"), # line colour gray25 = #404040
            strip.text = element_text(face = "bold")) + # facet labels bold  
      xlab("factors") + # set axis labels
      ylab("frequency")
    
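    The same counts and label positions can be computed in base R, as mentioned above. The following is only a sketch on a made-up long data frame `long`. It assumes a ggplot2 version that puts the first fill level on top of the stack (2.2.0 or later), so the counts are accumulated from the last Variant level upwards.

    ```r
    # Hypothetical stand-in for the melted data frame
    long <- data.frame(
      variable = rep("Region", 5),
      value    = c("VL", "VL", "VL", "NL", "NL"),
      Variant  = c("elke", "iedere", "iedere", "elke", "elke"),
      stringsAsFactors = FALSE
    )

    # Cross-tabulate all three columns and turn the table into a data frame
    counts <- as.data.frame(table(long), responseName = "n",
                            stringsAsFactors = FALSE)

    # Drop combinations that never occur (they would add "0" labels)
    counts <- counts[counts$n > 0, ]

    # Bottom segment first: reverse alphabetical order of Variant
    counts <- counts[order(counts$Variant, decreasing = TRUE), ]

    # Running total within each bar, then step back half a segment
    # to land on the vertical midpoint of that segment
    counts$y <- ave(counts$n, counts$variable, counts$value, FUN = cumsum) -
      0.5 * counts$n
    ```

    The resulting `counts` has the columns variable, value, Variant, n and y, so it can be passed to geom_text in the same way as df3.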

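    Another option, not part of the original answer, is to let ggplot2 compute both the counts and the label positions. This sketch assumes ggplot2 3.3.0 or later (for after_stat) and again uses a made-up data frame `long`.

    ```r
    library(ggplot2)

    # Hypothetical stand-in for the melted data frame
    long <- data.frame(
      variable = rep("Region", 5),
      value    = c("VL", "VL", "VL", "NL", "NL"),
      Variant  = c("elke", "iedere", "iedere", "elke", "elke"),
      stringsAsFactors = FALSE
    )

    p <- ggplot(long, aes(x = value, fill = Variant)) +
      geom_bar() +
      # stat = "count" repeats the counting done by geom_bar;
      # position_stack(vjust = 0.5) centres each label in its segment
      geom_text(stat = "count", aes(label = after_stat(count)),
                position = position_stack(vjust = 0.5)) +
      facet_grid(~variable, scales = "free_x")
    ```

    Because the text layer uses the same stacking as the bars, the labels stay aligned regardless of the stacking order, and no separate summary data frame is needed.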
