Changing the dataset of a ggplot object

Question

I'm plotting subsets of my data with ggplot2, and I was wondering whether I could somehow reuse all the options already contained in a ggplot object on a subset of the original data. As an example, take this first plot (code chunk 1):

require(ggplot2)
p <- ggplot(mtcars, aes(mpg, wt, color=factor(cyl))) + geom_point(shape=21, size=4)
print(p)

Now I want to make a second plot with a subset of mtcars, so I would normally do this (code chunk 2):

new_data <- subset(mtcars, disp > 200)
p <- ggplot(new_data, aes(mpg, wt, color=factor(cyl))) + geom_point(shape=19, size=4)
print(p)

It seems a little cumbersome to write all the code again for such a small difference in the dataset. Usually in ggplot2 you can change some parameters (is that the right term?) by adding the right operations to p; for example, I can change the plot colors with p + scale_color_manual(values=rainbow(3)). Of course, this is just a silly example, but it gets really tiresome when I have a really detailed plot, with many tweaks everywhere.

So basically, what I would like to know is whether there is some function x such that I can do this:

p + x(data = new_data)

and get the same as with code chunk 2.

Thanks a lot, Juan

Answer 1:

I think it can be done very easily with the ggplot2 %+% operator, which replaces the data of an existing plot while keeping everything else.

p <- ggplot(mtcars, aes(mpg, wt, color=factor(cyl))) + geom_point(shape=21, size=4)
print(p)

p2 <- p %+% mtcars[mtcars$disp > 200, ]
print(p2)
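
To make the point about "many tweaks" concrete, here is a minimal sketch of the same idea on a plot with a custom scale and theme. It assumes a ggplot2 version in which %+% is available; scale_color_manual(values=rainbow(3)) is taken from the question, while theme_bw() and the row-count checks are only added for illustration.

```r
library(ggplot2)

# Build the detailed plot once, on the full data
p <- ggplot(mtcars, aes(mpg, wt, color = factor(cyl))) +
  geom_point(shape = 21, size = 4) +
  scale_color_manual(values = rainbow(3)) +
  theme_bw()  # illustrative tweak, not part of the original question

# Swap in the subset; layers, scales and theme are carried over unchanged
new_data <- subset(mtcars, disp > 200)
p2 <- p %+% new_data
print(p2)

# Only the data differs between the two plot objects
nrow(p$data)
nrow(p2$data)
length(p$layers) == length(p2$layers)
```

One thing to watch for: the subset may contain fewer levels of factor(cyl) than the full data, and an unnamed vector of manual colors is assigned to the levels in order, so a given cyl value can end up with a different color in the second plot. Passing a named vector to values is one way to pin the mapping.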

Answer 2:

If you only want the plot from code chunk 2:

ggplot(mtcars[mtcars$disp > 200, ], aes(mpg, wt, color = factor(cyl))) +
  geom_point(shape = 19, size = 4)

If you want both of them in one plot:

ggplot() +
  geom_point(data = mtcars, aes(mpg, wt, color = factor(cyl)), shape = 21, size = 4) +
  geom_point(data = mtcars[mtcars$disp > 200, ], aes(mpg, wt, color = factor(cyl)), shape = 19, size = 4)

Answer 3:

If the question is re-phrased as, "How can I avoid writing repeated code to make similar plots with different data?", one answer is to use functions that apply to a ggplot object:

my_plot <- function (p) {
    p + aes(color=factor(cyl)) + geom_point(shape=21, size=4)
}

p1 <- ggplot(mtcars, aes(mpg, wt))
p2 <- ggplot(new_data, aes(mpg, wt))  # new_data as defined in the question

p1 <- my_plot(p1); print(p1)
p2 <- my_plot(p2); print(p2)

This stuffs all the shared plot parameters in one place, making the code clearer and easing maintenance. The code still runs for each plot object, of course.

You can of course arbitrarily complicate things by combining functions (a recent example from my own work):

p2 <- by_gene(stacked_bars(my_plot(p2)))
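
A variation on the function-based approach in this answer, sketched under the assumption that the whole plot specification is shared and only the data changes: wrap the full ggplot() call in a function that takes the data frame. The name plot_cars is hypothetical and does not appear in the answers above.

```r
library(ggplot2)

# Hypothetical helper: takes a data frame with mpg, wt and cyl columns
# and returns the fully specified plot
plot_cars <- function(data) {
  ggplot(data, aes(mpg, wt, color = factor(cyl))) +
    geom_point(shape = 21, size = 4)
}

p_all <- plot_cars(mtcars)
p_sub <- plot_cars(subset(mtcars, disp > 200))
print(p_sub)
```

Compared with my_plot(), this keeps the default aesthetics inside the function as well, at the cost of being less composable with other plot-modifying functions.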

Source: https://stackoverflow.com/questions/29336964/changing-the-dataset-of-a-ggplot-object

Tags: graphics, plot, ggplot2