How to get a barplot with several variables side by side grouped by a factor

日久生厌 2020-11-29 00:26

I have a dataset which looks like this one below. I am trying to make a barplot with the grouping variable gender, with all the variables side by side on the x axis (grouped

3 Answers
  • 2020-11-29 00:49

    You can use aggregate to calculate the means:

    means <- aggregate(df, by=list(df$gender), mean)
    means
    #   Group.1      tea     coke     beer    water gender
    # 1       1 87.70171 27.24834 24.27099 37.24007      1
    # 2       2 24.73330 25.27344 25.64657 24.34669      2
    

    Drop the Group.1 column:

    means<-means[,2:length(means)]
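
As a possible shortcut, the formula interface of aggregate keeps the grouping variable under its own name, so there is no Group.1 column to drop afterwards. A minimal sketch, assuming df is the five-row sample table from the question (reproduced here so the block runs on its own):

```r
# Sample data from the question, repeated so this block is self-contained.
df <- read.table(text=
"tea                coke            beer             water           gender
14.55              26.50793651     22.53968254      40              1
24.92997199        24.50980392     26.05042017      24.50980393     2
23.03732304        30.63063063     25.41827542      20.91377091     1
225.51781276       24.6064623      24.85501243      50.80645161     1
24.53662842        26.03706973     25.24271845      24.18358341     2", header=TRUE)

# ". ~ gender" means: summarise every other column, grouped by gender.
# The result has gender as its first column and no Group.1 column.
means <- aggregate(. ~ gender, data = df, FUN = mean)
means
```

The column order differs from the by= version (gender comes first), which does not matter for the melt step because the id column is selected by name.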
    

    Then you have to reshape the data into long format:

    library(reshape2)
    means.long <- melt(means, id.vars="gender")
    means.long
    #   gender variable    value
    # 1      1      tea 87.70171
    # 2      2      tea 24.73330
    # 3      1     coke 27.24834
    # 4      2     coke 25.27344
    # 5      1     beer 24.27099
    # 6      2     beer 25.64657
    # 7      1    water 37.24007
    # 8      2    water 24.34669
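
To make the long-format step concrete, here is a rough base-R sketch of the same reshaping without reshape2. It is an illustration of what the long layout looks like, not a description of how melt is implemented. The means table is typed in from the output printed above, and beverages is a helper name introduced only for this example:

```r
# Group means copied from the aggregate() output shown in this answer.
means <- data.frame(
  tea    = c(87.70171, 24.73330),
  coke   = c(27.24834, 25.27344),
  beer   = c(24.27099, 25.64657),
  water  = c(37.24007, 24.34669),
  gender = c(1, 2)
)

# Every column except the id column becomes a level of "variable".
beverages <- setdiff(names(means), "gender")

# Stack the beverage columns on top of each other: one row per
# (gender, beverage) pair, which is the layout ggplot2 works with.
means.long <- data.frame(
  gender   = rep(means$gender, times = length(beverages)),
  variable = factor(rep(beverages, each = nrow(means)), levels = beverages),
  value    = unlist(means[beverages], use.names = FALSE)
)
means.long
```

Setting the factor levels explicitly keeps the beverages in their original column order on the x axis instead of alphabetical order.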
    

    Finally, you can use ggplot2 to create your plot:

    library(ggplot2)
    ggplot(means.long,aes(x=variable,y=value,fill=factor(gender)))+
      geom_bar(stat="identity",position="dodge")+
      scale_fill_discrete(name="Gender",
                          breaks=c(1, 2),
                          labels=c("Male", "Female"))+
      xlab("Beverage")+ylab("Mean Percentage")
    

  • 2020-11-29 00:51

    Using reshape2 and dplyr. Your data:

    df <- read.table(text=
    "tea                coke            beer             water           gender
    14.55              26.50793651     22.53968254      40              1
    24.92997199        24.50980392     26.05042017      24.50980393     2
    23.03732304        30.63063063     25.41827542      20.91377091     1   
    225.51781276       24.6064623      24.85501243      50.80645161     1
    24.53662842        26.03706973     25.24271845      24.18358341     2", header=TRUE)
    

    Getting data into correct form:

    library(reshape2)
    library(dplyr)
    df.melt <- melt(df, id="gender")
    # %>% replaces the old %.% operator, which was removed from dplyr
    bar <- df.melt %>% group_by(variable, gender) %>% summarise(mean=mean(value))
    

    Plotting:

    library(ggplot2)
    ggplot(bar, aes(x=variable, y=mean, fill=factor(gender)))+
      geom_bar(position="dodge", stat="identity")
    

  • 2020-11-29 01:04

    You can plot the means without resorting to external calculations and additional tables using stat_summary(...). In fact, stat_summary(...) was designed for exactly what you are doing.

    library(ggplot2)
    library(reshape2)            # for melt(...)
    gg <- melt(df,id="gender")   # df is your original table
    ggplot(gg, aes(x=variable, y=value, fill=factor(gender))) + 
      # fun= replaces fun.y=, which is deprecated since ggplot2 3.3.0
      stat_summary(fun=mean, geom="bar", position=position_dodge(1)) + 
      scale_fill_discrete("Gender")
    

    To add "error bars" you can also use stat_summary(...) (here, I'm using the min and max values rather than the sd because you have so little data).

    ggplot(gg, aes(x=variable, y=value, fill=factor(gender))) + 
      stat_summary(fun=mean, geom="bar", position=position_dodge(1)) + 
      # fun.min=/fun.max= replace the deprecated fun.ymin=/fun.ymax=
      stat_summary(fun.min=min, fun.max=max, geom="errorbar",
                   color="grey40", position=position_dodge(1), width=.2) +
      scale_fill_discrete("Gender")
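
To check the numbers behind the bars and whiskers, the same per-group statistics can be computed directly. A minimal base-R sketch, assuming the five-row sample table from the question; the names beverages, long and stats are illustrative only and do not appear in the answer:

```r
# Sample data from the question, repeated so this block is self-contained.
df <- read.table(text=
"tea                coke            beer             water           gender
14.55              26.50793651     22.53968254      40              1
24.92997199        24.50980392     26.05042017      24.50980393     2
23.03732304        30.63063063     25.41827542      20.91377091     1
225.51781276       24.6064623      24.85501243      50.80645161     1
24.53662842        26.03706973     25.24271845      24.18358341     2", header=TRUE)

# Long format built by hand: one row per (gender, beverage, observation).
beverages <- setdiff(names(df), "gender")
long <- data.frame(
  gender   = rep(df$gender, times = length(beverages)),
  variable = rep(beverages, each = nrow(df)),
  value    = unlist(df[beverages], use.names = FALSE)
)

# Mean gives the bar height; min and max give the whisker ends.
stats <- aggregate(value ~ variable + gender, data = long,
                   FUN = function(v) c(mean = mean(v), min = min(v), max = max(v)))

# aggregate() returns the three statistics as one matrix column;
# flatten it into value.mean, value.min and value.max.
stats <- do.call(data.frame, stats)
stats
```

With only two or three observations per group, the min/max range is easy to read off this table, which is the reason the answer gives for preferring it over the standard deviation here.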
    
