Bar plot in ggplot2 of two categorical variables, faceted by a third variable with facet_wrap, displaying percentages

Submitted by 醉酒当歌 on 2021-02-07 10:14:17

Question


I would like to draw a bar plot in ggplot2 of one categorical variable grouped by a second categorical variable, and use facet_wrap to split it into separate panels by a third. Then I would like to show the percentage for each bar. Here is a reproducible example:

test <- data.frame(
  test1 = sample(letters[1:2], 100, replace = TRUE), 
  test2 = sample(letters[3:5], 100, replace = TRUE),
  test3 = sample(letters[9:11],100, replace = TRUE )
)


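library(ggplot2)

# after_stat(prop) replaces the older ..prop.. notation (ggplot2 >= 3.3.0)
ggplot(test, aes(x = factor(test1))) +
  geom_bar(aes(fill = factor(test2), y = after_stat(prop), group = factor(test2)), position = "dodge") +
  facet_wrap(~ factor(test3)) +
  # scales::percent formats the 0-1 proportions as percentages
  scale_y_continuous("Percentage (%)", limits = c(0, 1), breaks = seq(0, 1, by = 0.1), labels = scales::percent) +
  scale_x_discrete("") +
  theme(plot.title = element_text(hjust = 0.5), panel.grid.major.x = element_blank())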

This gives me a bar plot with the percentage of test2 by test1 within each test3 panel. I would like to show the percentage on top of each bar. I would also like to change the legend title on the right from factor(test2) to Test2.

[Image: the plot produced by the code above]


Answer 1:


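It may be easiest to summarize the data yourself, so that you can create a column holding the percentage labels you want. (Note that, as written, I'm not sure what you want your percentages to show: in facet i, group b, there is one column that is nearly 90% and two columns that are greater than or equal to 50%. Because the bars are grouped by test2, the proportion is computed within each test2 level in a facet, so the bars within one test1 group do not sum to 100%. Is that intended?)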

Libraries and your example data frame:

library(ggplot2)
library(dplyr)

test <- data.frame(
  test1 = sample(letters[1:2], 100, replace = TRUE), 
  test2 = sample(letters[3:5], 100, replace = TRUE),
  test3 = sample(letters[9:11],100, replace = TRUE )
)

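First, group by all three columns, then summarize to count the rows for each test2 level. The order matters: summarize removes the last grouping variable (test2), so the result is still grouped by test1 and test3, and the sum in the following mutate is taken within each test1/test3 combination. Mutate then computes the value used for both the column height and the label; here I've multiplied by 100 and rounded to one decimal place.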

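test.grouped <- test %>%
  # test2 goes last: summarize() drops it from the grouping afterwards
  group_by(test1, test3, test2) %>%
  summarize(t2.len = n()) %>%
  # still grouped by test1 and test3, so sum() is the total per test1/test3 cell
  mutate(t2.prop = round(t2.len / sum(t2.len) * 100, 1))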

> test.grouped
# A tibble: 18 x 5
# Groups:   test1, test3 [6]
    test1  test3  test2 t2.len t2.prop
   <fctr> <fctr> <fctr>  <int>   <dbl>
 1      a      i      c      4    30.8
 2      a      i      d      5    38.5
 3      a      i      e      4    30.8
 4      a      j      c      3    20.0
 5      a      j      d      8    53.3
...
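As a quick sanity check on the summary step, the percentages should add up to roughly 100 within every test1/test3 combination. The sketch below is one way to confirm that; the fixed seed, the use of count() in place of group_by() plus summarize(), and the name totals are my own choices for illustration, not part of the original answer.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed so the check is repeatable
test <- data.frame(
  test1 = sample(letters[1:2], 100, replace = TRUE),
  test2 = sample(letters[3:5], 100, replace = TRUE),
  test3 = sample(letters[9:11], 100, replace = TRUE)
)

# count() tallies rows per combination; the percentage is then
# computed within each test1/test3 cell
test.grouped <- test %>%
  count(test1, test3, test2, name = "t2.len") %>%
  group_by(test1, test3) %>%
  mutate(t2.prop = round(t2.len / sum(t2.len) * 100, 1)) %>%
  ungroup()

# Sum of percentages per test1/test3 cell (a 2 x 3 matrix);
# each entry should be 100 up to rounding error
totals <- with(test.grouped, tapply(t2.prop, list(test1, test3), sum))
print(totals)
```

If the totals are far from 100, the grouping order is probably not the one you intended.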

Build the plot from the summarized data, and use geom_text with the proportion column as the label:

ggplot(test.grouped, aes(x = test1, 
                         y = t2.prop, 
                         fill = test2, 
                         group = test2)) +  
  # geom_col() is shorthand for geom_bar(stat = "identity")
  geom_col(position = position_dodge(width = 0.9)) +
  # same dodge width as the bars, so each label sits above its own bar
  geom_text(aes(label = paste0(t2.prop, "%"), 
                group = test2), 
            position = position_dodge(width = 0.9),
            vjust = -0.8) +
  facet_wrap(~ test3) + 
  scale_y_continuous("Percentage (%)") +
  scale_x_discrete("") + 
  theme(plot.title = element_text(hjust = 0.5), panel.grid.major.x = element_blank())
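The question also asked how to change the legend title to Test2, which the plot above does not do. One way, as far as I know, is to set the title of the fill legend with labs(). The sketch below rebuilds the summary with count() so that it runs on its own; the object name p is only for illustration.

```r
library(ggplot2)
library(dplyr)

test <- data.frame(
  test1 = sample(letters[1:2], 100, replace = TRUE),
  test2 = sample(letters[3:5], 100, replace = TRUE),
  test3 = sample(letters[9:11], 100, replace = TRUE)
)

# Percentage of each test2 level within every test1/test3 cell
test.grouped <- test %>%
  count(test1, test3, test2, name = "t2.len") %>%
  group_by(test1, test3) %>%
  mutate(t2.prop = round(t2.len / sum(t2.len) * 100, 1)) %>%
  ungroup()

p <- ggplot(test.grouped, aes(x = test1, y = t2.prop,
                              fill = test2, group = test2)) +
  geom_col(position = position_dodge(width = 0.9)) +
  geom_text(aes(label = paste0(t2.prop, "%")),
            position = position_dodge(width = 0.9),
            vjust = -0.8) +
  facet_wrap(~ test3) +
  # fill = "Test2" sets the legend title; x = NULL removes the x-axis title
  labs(x = NULL, y = "Percentage (%)", fill = "Test2")

print(p)
```

scale_fill_discrete(name = "Test2") should give the same legend title if you prefer to set it on the scale.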



Source: https://stackoverflow.com/questions/47478138/barplot-with-ggplot-2-of-two-categorical-variable-facet-wrap-according-a-third-v
