library(ggplot2)
data = diamonds[, c('carat', 'color')]
data = data[data$color %in% c('D', 'E'), ]
I would like to compare the histogram of
It seems that binning the data outside of ggplot2 is the way to go. But I would still be interested to see if there is a way to do it with ggplot2.
library(dplyr)
breaks = seq(0,4,0.5)
data$carat_cut = cut(data$carat, breaks = breaks)
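# Count diamonds per color and carat bin, then turn counts into within-color shares
data_cut = data %>%
  group_by(color, carat_cut) %>%
  summarise(n = n()) %>%
  mutate(freq = n / sum(n))
# Plot the per-color percentages side by side
ggplot(data = data_cut, aes(x = carat_cut, y = freq * 100, fill = color)) +
  geom_bar(stat = "identity", position = "dodge") +
  scale_x_discrete(labels = breaks) +
  ylab("Percentage") + xlab("Carat")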
You can scale them by group by using the ..group.. special variable to subset the ..count.. vector. It is pretty ugly because of all the dots, but here it goes:
ggplot(data, aes(carat, fill = color)) +
  geom_histogram(aes(y = c(..count..[..group.. == 1] / sum(..count..[..group.. == 1]),
                           ..count..[..group.. == 2] / sum(..count..[..group.. == 2])) * 100),
                 position = 'dodge', binwidth = 0.5) +
  ylab("Percentage") + xlab("Carat")
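The same per-group scaling can probably be written with fewer dots. The sketch below is a variant, not part of the original answer: it assumes ggplot2 3.3.0 or later, where after_stat() is available as the replacement for the ..count.. notation, and it uses base R's ave() to divide each count by its group total, so the groups do not have to be listed one by one.

```r
library(ggplot2)

# Same subset as in the question: carat and color for colors D and E only
data <- diamonds[diamonds$color %in% c("D", "E"), c("carat", "color")]

# ave() applies FUN within each group, so every bin count is divided by the
# total count of its own group before being converted to a percentage
p <- ggplot(data, aes(carat, fill = color)) +
  geom_histogram(
    aes(y = after_stat(ave(count, group, FUN = function(x) x / sum(x)) * 100)),
    position = "dodge", binwidth = 0.5
  ) +
  ylab("Percentage") + xlab("Carat")

p
```

Because the scaling no longer refers to group numbers 1 and 2, this version should also work unchanged if more than two colors are kept in the subset.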