Question
I have multiple columns of data, let's say x and time. I want to make a histogram of column x and color each bar based on an aggregation of the values in column time, where the aggregation is grouped by the breaks used for the histogram. For example:
d = cbind(c(rep(1,3), rep(2,3)), c(10,20,10,20,10,20))
colnames(d) = c("x", "time")  # d is a matrix, so set colnames(), not names()
hist(d[,"x"])
This gives me a nice histogram. Now let's say I want something like this for the colors:
palette(rainbow(25))
hist(d[,"x"], col=d[,"time"], n=10)
I would like col to be a vector of length 10 (one entry per bin) holding an aggregate, such as the mean, of the time values that fall in that bin.
Answer 1:
I would do this with plyr and ggplot2:
library(plyr)
library(ggplot2)
d <- data.frame(x=rep(1:4, each=4), time=sample(10:100, 16, replace=TRUE))
# add the per-x mean of time as a new column
d <- ddply(d, .(x), transform, mean.time=mean(time))
# one group per x value, filled by that group's mean time
ggplot(d, aes(x=x, group=x, fill=mean.time)) +
  geom_histogram()
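If plyr is not available, the per-group mean step can probably be done in base R as well. A minimal sketch, assuming ave() is an acceptable stand-in for the ddply() call; the seed is an arbitrary choice and is only there to make the sample reproducible:

```r
set.seed(1)  # arbitrary seed (an assumption), only for reproducibility
d <- data.frame(x = rep(1:4, each = 4),
                time = sample(10:100, 16, replace = TRUE))

# ave() returns a vector as long as its input, where each element is
# replaced by the mean (its default FUN) of that element's group
d$mean.time <- ave(d$time, d$x)

head(d)
```

The resulting d has the same x, time and mean.time columns that the ggplot() call above expects.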
Answer 2:
If I understood correctly, you would like to average the time values for each x and plot the result as bars. But which colouring do you want: a gradient or discrete colours, and based on the mean time values or on the x values?
Consider this example as a starting point:
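library(ggplot2)
d <- data.frame(x=rep(1:4, each=4), time=sample(10:100, 16, replace=TRUE)) # thanks to Andy :)
# discrete fill, one colour per x value
# (fun= replaced fun.y= in ggplot2 3.3.0; use fun.y= on older versions)
ggplot(d, aes(x=factor(x), y=time)) +
  stat_summary(fun="mean", geom="bar", aes(fill=factor(x)))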
or
# continuous fill, a gradient over the x values
ggplot(d, aes(x=factor(x), y=time)) +
  stat_summary(fun="mean", geom="bar", aes(fill=x))
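If you would rather stay with base graphics, as in the question, one possible approach is to let hist() compute the breaks first and then reuse them to group the time column. This is only a sketch under a few assumptions: the data is held in a data frame rather than a matrix, the names bin, bin_mean, pal and bar_col are made up for illustration, and the 25-colour rainbow palette is taken from the question.

```r
# Toy data mirroring the question: x is binned, time is aggregated per bin
d <- data.frame(x = c(rep(1, 3), rep(2, 3)),
                time = c(10, 20, 10, 20, 10, 20))

# Compute the histogram without drawing it, so its breaks can be reused
h <- hist(d$x, plot = FALSE)

# Assign each row to a bin with the same rule hist() uses by default:
# right-closed intervals, with the lowest break included
bin <- cut(d$x, breaks = h$breaks, include.lowest = TRUE, right = TRUE)

# Mean of time per bin; bins without observations come back as NA
bin_mean <- as.numeric(tapply(d$time, bin, mean))

# Map the means onto the palette; empty bins get NA, i.e. no fill
pal <- rainbow(25)
bar_col <- pal[cut(bin_mean, breaks = 25, labels = FALSE)]

# Draw the precomputed histogram with one colour per bar
plot(h, col = bar_col, main = "x, coloured by mean(time) per bin")
```

Replacing mean in the tapply() call with another function (median, max, ...) changes the aggregation without touching the rest.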
Source: https://stackoverflow.com/questions/11872738/use-histogram-breaks-to-apply-function-over-second-column