Boxplot schmoxplot: How to plot means and standard errors conditioned by a factor in R?


We all love robust measures like medians and interquartile ranges, but let's face it: in many fields, boxplots almost never show up in published articles, while means and sta


5 Answers

深忆病人

2020-12-13 16:13

ggplot produces aesthetically pleasing graphs, but I don't have the gumption to try and publish any ggplot output yet.

Until that day comes, here is how I have been making the aforementioned graphs. I use the gplots package to draw the standard error bars from means and standard errors I have already calculated. Note that this code handles two or more factors for each class/category. That requires passing the data as a matrix and setting the beside=TRUE argument of barplot2() so the bars are drawn side by side rather than stacked.

# Create the data (means) matrix
# Using the matrix accommodates two or more factors for each class

data.m <- matrix(c(75,34,19, 39,90,41), nrow = 2, ncol=3, byrow=TRUE,
               dimnames = list(c("Factor 1", "Factor 2"),
                                c("Class A", "Class B", "Class C")))

# Create the standard error matrix

error.m <- matrix(c(12,10,7, 4,7,3), nrow = 2, ncol = 3, byrow=TRUE)

# load library {gplots}

library(gplots)

# Plot the bar graph, with standard errors.
# barplot2() takes the two matrices directly, so no intermediate
# data frame (or with() wrapper) is needed.

barplot2(data.m, beside=TRUE, axes=TRUE, las=1, ylim=c(0,120),
         main=" ", sub=" ", col=c("gray20",0),
         xlab="Class", ylab="Total amount (Mean +/- s.e.)",
         plot.ci=TRUE, ci.u=data.m+error.m, ci.l=data.m-error.m, ci.lty=1)

# Now, give it a legend:

legend("topright", c("Factor 1", "Factor 2"), fill=c("gray20",0),box.lty=0)

It is pretty plain-Jane, aesthetically, but seems to be what most journals/old professors want to see.

I'd post the graph produced by these example data, but this is my first post on the site. Sorry. One should be able to copy-paste the whole thing (after installing the "gplots" package) without problem.
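
If the means and standard errors are not already calculated, they can be derived from raw data first. The following is a minimal base-R sketch and not part of the original answer: it assumes the built-in mtcars data (hp grouped by cyl and am) as a stand-in for real data, it uses a hypothetical helper named se, and it draws the bars with base barplot() and arrows() rather than barplot2().

```r
# Assumption: mtcars stands in for your own raw data.
# Hypothetical helper: standard error of the mean.
se <- function(x) sd(x) / sqrt(length(x))

# Means and standard errors as matrices: rows = am, columns = cyl
means <- with(mtcars, tapply(hp, list(am = am, cyl = cyl), mean))
errs  <- with(mtcars, tapply(hp, list(am = am, cyl = cyl), se))

# barplot() returns the bar midpoints in the same layout as 'means'
mids <- barplot(means, beside = TRUE, las = 1,
                ylim = c(0, max(means + errs) * 1.1),
                col = c("gray20", "white"),
                xlab = "Cylinders", ylab = "Horsepower (mean +/- s.e.)")

# Draw each error bar as a flat-headed, double-ended arrow
arrows(mids, means - errs, mids, means + errs,
       angle = 90, code = 3, length = 0.05)

legend("topleft", legend = c("am = 0", "am = 1"),
       fill = c("gray20", "white"), box.lty = 0)
```

The means and errs matrices have the same shape as data.m and error.m, so they could be passed to barplot2() in their place.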


刺人心

2020-12-13 16:22

The first plot was just covered in a blog post on imachordata.com (hat tip to David Smith on blog.revolution-computing.com). You can also read Hadley's related documentation on ggplot2.

Here's the example code:

library(ggplot2)
library(plyr)  # provides ddply()
data(mpg)

# create a data frame with averages and standard deviations
hwy.avg <- ddply(mpg, c("class", "year"), function(df)
  c(hwy.avg = mean(df$hwy), hwy.sd = sd(df$hwy)))

# first, define the width of the dodge
dodge <- position_dodge(width = 0.9)

# create the barplot component; geom_col() plots the precomputed means as-is
# (recent ggplot2 versions of qplot() no longer accept position=)
avg.plot <- ggplot(hwy.avg, aes(class, hwy.avg, fill = factor(year))) +
  geom_col(position = dodge)

# now add the error bars (mean +/- 1 SD) to the plot
avg.plot + geom_linerange(aes(ymax = hwy.avg + hwy.sd, ymin = hwy.avg - hwy.sd), position = dodge) + theme_bw()
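
Note that the error bars in this answer show standard deviations, while the question asks about standard errors. As a hedged sketch of the conversion (not from the original answer), the summary table can also be built in base R with aggregate(), dividing each SD by the square root of the group size. It assumes the built-in mtcars data as a stand-in for mpg, and the column names hp.mean, hp.sd, hp.n and hp.se are simply what this particular construction produces.

```r
# Assumption: mtcars (hp by cyl and am) stands in for mpg (hwy by class and year).
summ <- aggregate(hp ~ cyl + am, data = mtcars,
                  FUN = function(x) c(mean = mean(x), sd = sd(x), n = length(x)))

# aggregate() stores the three statistics in one matrix column;
# flatten it into ordinary columns hp.mean, hp.sd, hp.n
summ <- do.call(data.frame, summ)

# Standard error of the mean = SD / sqrt(n)
summ$hp.se <- summ$hp.sd / sqrt(summ$hp.n)

print(summ)
```

Using hp.mean - hp.se and hp.mean + hp.se as ymin and ymax would then give standard-error bars instead of SD bars.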



别跟我提以往

2020-12-13 16:23
ggplot2 can compute means and their standard errors automatically: with no summary function supplied, stat_summary() defaults to mean_se(). I would recommend using the default pointranges instead of dynamite plots. You might have to set the position manually. Here is how:
```
ggplot(mtcars, aes(factor(cyl), hp, color = factor(am))) +
  stat_summary(position = position_dodge(0.5))
```
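
To make explicit what that default summary computes, here is a rough base-R sketch of the same calculation. The function name mean_se_manual is hypothetical, and the code is meant to mirror what ggplot2's mean_se() reports (the mean plus and minus one standard error), not to reproduce its source exactly.

```r
# Hypothetical re-implementation of a mean +/- SE summary.
# 'mult' scales the standard error (1 = one SE on each side).
mean_se_manual <- function(x, mult = 1) {
  x <- x[!is.na(x)]                       # drop missing values
  se <- mult * sqrt(var(x) / length(x))   # standard error of the mean
  m <- mean(x)
  data.frame(y = m, ymin = m - se, ymax = m + se)
}

# Apply it per group: horsepower by number of cylinders
by_cyl <- do.call(rbind, lapply(split(mtcars$hp, mtcars$cyl), mean_se_manual))
print(by_cyl)
```

Each row corresponds to one point (y) and one vertical range (ymin to ymax) in a pointrange plot.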
一个人的身影

2020-12-13 16:29
This question is almost 2 years old now, but as a new R user in an experimental field, this was a big question for me, and this page is prominent in Google results. I just discovered an answer I like better than the current set, so I thought I'd add it.

The sciplot package makes the task super easy. It gets the job done in a single command:
```
#only necessary to get the mpg dataset from ggplot2 for direct comparison
library(ggplot2)
data(mpg)

#the bargraph.CI function with a couple of parameters to match the ggplot example
#see also lineplot.CI in the same package
library(sciplot)
bargraph.CI(
  class,  #categorical factor for the x-axis
  hwy,    #numerical DV for the y-axis
  year,   #grouping factor
  data=mpg,  #look the variables up in mpg instead of using attach()
  legend=TRUE,
  x.leg=19,
  ylab="Highway MPG",
  xlab="Class")
```
This produces a very workable graph with mostly default options. Note that the error bars are standard errors by default, but the ci.fun parameter takes a function, so they can be anything you want!
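
As an illustration of swapping in a different error-bar function, here is a small sketch that computes a 95% confidence interval from the t distribution. The name ci95 is hypothetical; the function returns the lower and upper limits as a length-2 vector, which is the shape I understand bargraph.CI to expect from its ci.fun argument. The plotting call is guarded so the block still runs when sciplot is not installed.

```r
# Hypothetical error-bar function: 95% t-based confidence interval.
# Returns c(lower, upper).
ci95 <- function(x) {
  x <- x[!is.na(x)]
  half <- qt(0.975, df = length(x) - 1) * sd(x) / sqrt(length(x))
  c(mean(x) - half, mean(x) + half)
}

ci95(mtcars$hp)

# Assumed usage with sciplot (runs only if the package is available)
if (requireNamespace("sciplot", quietly = TRUE)) {
  sciplot::bargraph.CI(factor(cyl), hp, factor(am), data = mtcars,
                       ci.fun = ci95, legend = TRUE,
                       xlab = "Cylinders", ylab = "Horsepower")
}
```

With small groups the t-based interval is noticeably wider than plus or minus one standard error, so it is worth stating in the axis label or caption which one is shown.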

[愿得一人]

2020-12-13 16:31

Coming a little late to the game, but this solution might be useful for future users. It uses the diamonds data frame that ships with ggplot2 and takes advantage of stat_summary along with a few (super short) custom functions.

library(ggplot2)

# helper: standard error of the mean
# (named std.err so it does not mask base::stderr())
std.err <- function(x) sqrt(var(x, na.rm = TRUE) / length(na.omit(x)))
# lower and upper bounds of the error bars (mean -/+ 1 SE)
low.se <- function(x) mean(x, na.rm = TRUE) - std.err(x)
high.se <- function(x) mean(x, na.rm = TRUE) + std.err(x)

# create a ggplot
ggplot(diamonds, aes(cut, price, fill = color)) +
  # first layer is barplot with means
  # (fun/fun.min/fun.max replace fun.y/fun.ymin/fun.ymax, deprecated in ggplot2 3.3.0)
  stat_summary(fun = mean, geom = "bar", position = "dodge", colour = "white") +
  # second layer overlays the error bars using the functions defined above
  # (linewidth needs ggplot2 >= 3.4.0; older versions use size)
  stat_summary(fun = mean, fun.min = low.se, fun.max = high.se, geom = "errorbar",
               position = "dodge", colour = "black", linewidth = 0.5)
