How would you plot a box plot and specific points on the same plot?

Submitted by 时光怂恿深爱的人放手 on 2019-12-06 04:45:56

Question


We can draw a box plot as follows:

qplot(factor(cyl), mpg, data = mtcars, geom = "boxplot")

and points as:

qplot(factor(cyl), mpg, data = mtcars, geom = "point") 

How would you combine both, but show only a few specific points (say, those where wt is less than 2) on top of the boxes?


Answer 1:


Use + geom_point(...) on your qplot (just add a + geom_point() to get all the points plotted).

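To plot selectively, pass only the rows you want as the data for the point layer. (Indexing inside aes(), e.g. mpg[idx], fails in current ggplot2 because an aesthetic must have length 1 or the same length as the layer's data.)

n <- nrow(mtcars)
# plot every second point
idx <- seq(1, n, by = 2)

qplot( factor(cyl), mpg, data=mtcars, geom="boxplot" ) +
     geom_point( data=mtcars[idx, ] )    # <-- only rows idx; x and y mappings are inherited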

If you know the points beforehand, you can supply them directly as a small data frame whose column names match the plot's mappings, e.g.:

qplot( factor(cyl), mpg, data=mtcars, geom="boxplot" ) +
     geom_point( data=data.frame(cyl=c(4,6,8), mpg=c(15,20,25)) ) # plot (4,15),(6,20),...
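For the case asked about in the question (only cars with wt less than 2), a minimal sketch along the same lines might look like this. It uses ggplot() instead of qplot(), and the names light and p as well as the red colour are illustrative choices, not part of the original answer.

```r
library(ggplot2)

# Rows to highlight: cars with wt below 2 (illustrative name `light`)
light <- subset(mtcars, wt < 2)

# Box plot of all cars, with only the selected rows overlaid as points.
# geom_point() inherits the x and y mappings from ggplot().
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  geom_point(data = light, colour = "red")
p
```

In mtcars all cars with wt below 2 have 4 cylinders, so the red points appear only over the first box.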



Answer 2:


If you are trying to plot two geoms from two different datasets (a box plot for mtcars, points for a data.frame of literal values), giving each layer its own data and aes makes your intent clear. This works with the version of ggplot2 current as of Sep 2016 (ggplot2_2.1.0).

library(ggplot2)
ggplot() +
  # box plot of mtcars (mpg vs cyl)
  geom_boxplot(data = mtcars, 
               aes(x = factor(cyl), y= mpg)) +
  # points of data.frame literal
  geom_point(data = data.frame(x = factor(c(4,6,8)), y = c(15,20,25)),
             aes(x=x, y=y),
             color = 'red')

I threw in a color = 'red' for the set of points, so it's easy to distinguish them from the outlier points that geom_boxplot draws itself.
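If colour alone is not enough to tell the overlaid points from the box plot's own outliers, one option is to hide the outliers altogether. The sketch below assumes that is acceptable for your plot; it uses geom_boxplot's outlier.shape argument, and the names pts and p are illustrative.

```r
library(ggplot2)

# Literal points to overlay (same values as in the answer above)
pts <- data.frame(x = factor(c(4, 6, 8)), y = c(15, 20, 25))

p <- ggplot() +
  # outlier.shape = NA hides the outlier points drawn by the box plot
  geom_boxplot(data = mtcars, aes(x = factor(cyl), y = mpg),
               outlier.shape = NA) +
  geom_point(data = pts, aes(x = x, y = y), colour = "red")
p
```

Hiding the outliers only stops them from being drawn; they still count towards the y-axis range.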




Answer 3:


You can show both by using ggplot() rather than qplot(). The syntax may be a little harder to understand, but you can usually get much more done. If you want to plot both the box plot and the points you can write:

boxpt <- ggplot(data = mtcars, aes(factor(cyl), mpg)) 
boxpt + geom_boxplot(aes(factor(cyl), mpg)) + geom_point(aes(factor(cyl), mpg))

I don't know what you mean by only plotting specific points on top of the box, but if you want a cheap (and probably not very smart) way of just showing points above the edge of the box, here it is:

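library(plyr)  # provides ddply() and .()

# per cylinder group, keep only the mpg values above the 75th percentile
top <- ddply(mtcars, .(cyl), summarise, mpg = mpg[mpg > quantile(mpg, 0.75)])
boxpt + geom_boxplot(aes(factor(cyl), mpg)) + geom_point(data = top, aes(factor(cyl), mpg))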

Basically it's the same thing except for the data supplied to geom_point is adjusted to include only the mpg numbers in the top quarter of the distribution by cylinder. In general I'm not sure this is good practice because I think people expect to see points beyond the whiskers only, but there you go.
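If you would rather not depend on plyr, the same top-quarter filter can be sketched in base R. This swaps the answer's ddply() call for ave(), and the names q75, top and p are illustrative.

```r
library(ggplot2)

# For every row, the 75th percentile of mpg within that row's cyl group
q75 <- ave(mtcars$mpg, mtcars$cyl, FUN = function(v) quantile(v, 0.75))

# Rows whose mpg lies above their own group's 75th percentile
top <- mtcars[mtcars$mpg > q75, ]

p <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  geom_boxplot() +
  geom_point(data = top)   # x and y mappings are inherited from ggplot()
p
```

Unlike the ddply() result, top keeps all the original mtcars columns, so other variables remain available for mapping to colour or labels.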



Source: https://stackoverflow.com/questions/9255739/how-would-you-plot-a-box-plot-and-specific-points-on-the-same-plot
