I am creating boxplots using ggplot and would like to represent the sample size contributing to each box. In the base plot function there is the varwidth option.
The current version of ggplot2 (v2.1.0) now has a varwidth option:
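library(ggplot2)
# Two groups of very different size: 700 and 50 observations
data <- data.frame(rbind(cbind(rnorm(700, 0, 10), rep("1", 700)),
                         cbind(rnorm(50, 0, 10), rep("2", 50))))
# cbind() coerced everything to character, so convert X1 back to numeric
data$X1 <- as.numeric(as.character(data$X1))
ggplot(data = data, aes(x = X2, y = X1)) +
  geom_boxplot(varwidth = TRUE)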
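As a variation on the above, the data frame can be built directly so the numeric column never becomes text. This is only a sketch: the names `df`, `value`, `group`, and `n_per_group` are made up for illustration, and it assumes ggplot2 is installed. According to the ggplot2 documentation, `varwidth = TRUE` draws boxes with widths proportional to the square roots of the group sizes.

```r
# Assumption: the names df, value, group and n_per_group are illustrative only
set.seed(1)  # make the random draws reproducible

# Build the two groups directly, keeping the measurements numeric
df <- data.frame(
  value = c(rnorm(700, mean = 0, sd = 10), rnorm(50, mean = 0, sd = 10)),
  group = factor(rep(c("1", "2"), times = c(700, 50)))
)

# Group sizes that the box widths are meant to reflect
n_per_group <- table(df$group)
print(n_per_group)

# Draw the plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = group, y = value)) +
    geom_boxplot(varwidth = TRUE)  # widths scale with sqrt(n) per group
  print(p)
}
```

Printing `n_per_group` next to the plot is a quick way to confirm which box corresponds to which sample size.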
It is not elegant, but you can do this by setting the width of each box by hand:
library(ggplot2)
data <- data.frame(rbind(cbind(rnorm(700, 0, 10), rep("1", 700)),
                         cbind(rnorm(50, 0, 10), rep("2", 50))))
data[, 1] <- as.numeric(as.character(data[, 1]))
# Box width: square root of each group's share of the observations
w <- sqrt(table(data$X2) / nrow(data))
# One layer per group, each with its own width
ggplot(NULL, aes(factor(X2), X1)) +
  geom_boxplot(width = w[[1]], data = subset(data, X2 == 1)) +
  geom_boxplot(width = w[[2]], data = subset(data, X2 == 2))
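
To see what widths that formula gives for the 700/50 split, the arithmetic can be checked on its own. A small base R sketch, where `counts` is a hand-written stand-in for `table(data$X2)`:

```r
# Group sizes from the example: 700 observations in group "1", 50 in group "2"
# (counts is an illustrative stand-in for table(data$X2))
counts <- c("1" = 700, "2" = 50)

# Width of each box: square root of the group's share of all observations
w <- sqrt(counts / sum(counts))
print(round(w, 3))

# The ratio of the two widths equals sqrt(700 / 50)
print(unname(w["1"] / w["2"]))
```

The widths come out at roughly 0.97 and 0.26, so the larger group's box is about 3.7 times as wide as the smaller one's.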

If X2 has several levels, you can avoid hardcoding each one:
library(plyr)  # provides llply(); w is looked up by level name below
ggplot(NULL, aes(factor(X2), X1)) +
  llply(unique(data$X2), function(i)
    geom_boxplot(width = w[[as.character(i)]], data = subset(data, X2 == i)))
You can also post a feature request: https://github.com/hadley/ggplot2/issues