问题
I am doing a basic boxplot where y=age and x=Patient groups
age <- ggplot(data, aes(factor(group2), age)) + ylim(15, 80)
age + geom_boxplot(fill = \"grey80\", colour = \"#3366FF\")
I was hoping you could help me out with a few things:
1) Is it possible to include a number of observations per group above each group boxplot (but NOT on the X axis where my group labels are) without having to do this in paint :)? I have tried using:
age + annotate(\"text\", x = \"CON\", y = 60, label = \"25\")
where CON is the 1st group and y = 60 is ~ just above the boxplot for this group. However, the command didn\'t work. I assume it has something to do that it reads x as a continuous rather than a categorical variable.
2) Also although there are plenty of questions about using the mean rather than the median for the boxplots, I still haven`t found a code that works for me?
3) On the same matter is there a way you could include the mean group stat in the boxplot? Perhaps using
age + stat_summary(fun.y=mean, colour=\"red\", geom=\"point\")
which however only includes a dot of where the mean lies. Or again using
age + annotate(\"text\", x = \"CON\", y = 30, label = \"30\")
where CON is the 1st group and y = 30 is ~ the group age mean.
Knowing how flexible and rich ggplot2 syntax is I was hoping that there is a more elegant way of using the real stats output rather than annotate.
Any suggestions/links would be much appreciated!
Thanks!!
回答1:
Is this anything like what you're after? With stat_summary, as requested:
# function for number of observations
give.n <- function(x){
return(c(y = median(x)*1.05, label = length(x)))
# experiment with the multiplier to find the perfect position
}
# function for mean labels
mean.n <- function(x){
return(c(y = median(x)*0.97, label = round(mean(x),2)))
# experiment with the multiplier to find the perfect position
}
# plot
ggplot(mtcars, aes(factor(cyl), mpg, label=rownames(mtcars))) +
geom_boxplot(fill = "grey80", colour = "#3366FF") +
stat_summary(fun.data = give.n, geom = "text", fun.y = median) +
stat_summary(fun.data = mean.n, geom = "text", fun.y = mean, colour = "red")
Black number is number of observations, red number is mean value. joran's answer shows you how to put the numbers at the top of the boxes
hat-tip: https://stackoverflow.com/a/3483657/1036500
回答2:
I think this is what you're looking for maybe?
myboxplot <- ddply(mtcars,
.(cyl),
summarise,
min = min(mpg),
q1 = quantile(mpg,0.25),
med = median(mpg),
q3 = quantile(mpg,0.75),
max= max(mpg),
lab = length(cyl))
ggplot(myboxplot, aes(x = factor(cyl))) +
geom_boxplot(aes(lower = q1, upper = q3, middle = med, ymin = min, ymax = max), stat = "identity") +
geom_text(aes(y = max,label = lab),vjust = 0)
I just realized I mistakenly used the median when you were asking about the mean, but you can obviously use whatever function for the middle aesthetic you please.
回答3:
Answer to the first problem.
To show value above the box you should provide x values as numeric not as level names. So, to plot the value above first value give x=1.
data(ToothGrowth)
ggplot(ToothGrowth,aes(supp,len))+geom_boxplot()+
annotate("text",x=1,y=32,label=30)
来源:https://stackoverflow.com/questions/15660829/how-to-add-a-number-of-observations-per-group-and-use-group-mean-in-ggplot2-boxp