I am trying to find a suitable display to illustrate various properties within and across school classes. For each class there are only 15-30 data points (pupils).
It seems the accepted answer no longer works since ggplot2 was updated. After a lot of searching I found the following thread: http://comments.gmane.org/gmane.comp.lang.r.ggplot2/3616 (see Winston Chang's reply).
He calculates the outliers separately using ddply and then plots them using
geom_dotplot()
having disabled the outlier output on the geom_boxplot():
geom_boxplot(outlier.colour = NA)
Here is the full code from the URL mentioned above:
# This returns a vector containing only the outliers of y
# (values more than coef * IQR below the first or above the third quartile)
find_outliers <- function(y, coef = 1.5) {
  qs <- c(0, 0.25, 0.5, 0.75, 1)
  stats <- as.numeric(quantile(y, qs))
  iqr <- diff(stats[c(2, 4)])
  outliers <- y < (stats[2] - coef * iqr) | y > (stats[4] + coef * iqr)
  return(y[outliers])
}
library(ggplot2) # Needed for the plots below
library(MASS) # Use the birthwt data set from MASS
# Find the outliers for each level of 'smoke'
library(plyr)
outlier_data <- ddply(birthwt, .(smoke), summarise, lwt = find_outliers(lwt))
# This draws an ordinary box plot
ggplot(birthwt, aes(x = factor(smoke), y = lwt)) + geom_boxplot()
# This draws the outliers using geom_dotplot
ggplot(birthwt, aes(x = factor(smoke), y = lwt)) +
  geom_boxplot(outlier.colour = NA) +
  # also consider:
  # geom_jitter(alpha = 0.5, size = 2) +
  geom_dotplot(data = outlier_data, binaxis = "y",
               stackdir = "center", binwidth = 4)
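If you would rather not depend on plyr, the per-group outlier step can probably be done in base R as well. The following is only a sketch of that idea, not part of Winston Chang's reply: it swaps ddply for split() and lapply(), and find_outliers here is a slightly shortened version of the function above that computes only the two quartiles it needs. The resulting outlier_data should be usable in the geom_dotplot() call in the same way, but I have not checked it against every ggplot2 version.

```r
library(MASS)  # birthwt data set, as in the code above

# Shortened variant of find_outliers (my assumption: same 1.5 * IQR rule)
find_outliers <- function(y, coef = 1.5) {
  qs <- quantile(y, c(0.25, 0.75), names = FALSE)
  iqr <- diff(qs)
  y[y < qs[1] - coef * iqr | y > qs[2] + coef * iqr]
}

# Split lwt by smoke and find the outliers within each group
by_group <- lapply(split(birthwt$lwt, birthwt$smoke), find_outliers)

# Stack the per-group results back into one data frame;
# smoke ends up as a character column ("0" / "1")
outlier_data <- data.frame(
  smoke = rep(names(by_group), lengths(by_group)),
  lwt   = as.numeric(unlist(by_group, use.names = FALSE))
)
```

Since the plot maps x to factor(smoke), it should not matter that smoke is a character column here rather than an integer.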