The Marimekko/Mosaic plot is a nice default plot when both x and y are categorical variables. What is the best way to create these using ggplot?
Here is a first attempt. I'm not sure how to put the factor labels on the axis, though.
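library(ggplot2)

makeplot_mosaic <- function(data, x, y, ...) {
  # Capture the unquoted column names as strings
  xvar <- deparse(substitute(x))
  yvar <- deparse(substitute(y))
  mytable <- table(data[c(xvar, yvar)])
  # Bar edges on the x axis: cumulative counts of each x level
  widths <- c(0, cumsum(rowSums(mytable)))
  # Segment edges on the y axis: cumulative proportions of y within each x level
  # (one column per x level, ncol(mytable) + 1 rows)
  heights <- apply(mytable, 1, function(x) c(0, cumsum(x / sum(x))))
  # One rectangle per (x level, y level) cell
  alldata <- data.frame()
  for (i in seq_len(nrow(mytable))) {
    for (j in seq_len(ncol(mytable))) {
      alldata <- rbind(alldata, c(widths[i], widths[i + 1], heights[j, i], heights[j + 1, i]))
    }
  }
  colnames(alldata) <- c("xmin", "xmax", "ymin", "ymax")
  # Attach the factor levels that each rectangle belongs to
  alldata[[xvar]] <- rep(dimnames(mytable)[[1]], rep(ncol(mytable), nrow(mytable)))
  alldata[[yvar]] <- rep(dimnames(mytable)[[2]], nrow(mytable))
  # aes_string() is deprecated in current ggplot2, so use the .data pronoun for the fill
  ggplot(alldata, aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax)) +
    geom_rect(color = "black", aes(fill = .data[[yvar]])) +
    labs(x = paste(xvar, "(count)"), y = paste(yvar, "(proportion)"), fill = yvar)
}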
Example:
makeplot_mosaic(mtcars, vs, gear)
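
For the axis labels, one possible approach is to keep the continuous x scale and place a break at the midpoint of each bar, using the factor levels as the labels. What follows is only a sketch under assumptions: the names `edges`, `mids`, `x_labels`, and `rects` are made up for illustration, and the rectangle table is rebuilt in a vectorized way for the same `mtcars` example rather than taken from `makeplot_mosaic`.

```r
# Sketch: label the x axis with factor levels instead of cumulative counts.
# Uses the same mtcars columns (vs, gear) as the example above.
tab <- table(mtcars[c("vs", "gear")])

# Left and right edge of each bar (cumulative counts of the x levels)
edges <- unname(c(0, cumsum(rowSums(tab))))

# The midpoint of each bar is where its label should sit
mids <- (head(edges, -1) + tail(edges, -1)) / 2
x_labels <- rownames(tab)

# Proportions of y within each x level, and their cumulative tops
props <- prop.table(tab, 1)
ymax <- t(apply(props, 1, cumsum))  # rows = x levels, columns = y levels

# One rectangle per cell; as.vector() is column-major, so x levels vary fastest
rects <- data.frame(
  xmin = rep(head(edges, -1), times = ncol(tab)),
  xmax = rep(tail(edges, -1), times = ncol(tab)),
  ymin = as.vector(ymax) - as.vector(props),
  ymax = as.vector(ymax),
  gear = rep(colnames(tab), each = nrow(tab))
)

# Plotting is guarded so the geometry above still runs without ggplot2 installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(rects, aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax, fill = gear)) +
    geom_rect(color = "black") +
    # Breaks at bar midpoints, labelled with the levels of the x variable
    scale_x_continuous(breaks = mids, labels = x_labels) +
    labs(x = "vs", y = "gear (proportion)")
  print(p)
}
```

The same `scale_x_continuous(breaks = ..., labels = ...)` layer could presumably be added inside `makeplot_mosaic`, computing the midpoints from `widths`. The y axis is harder to label this way because the segment boundaries differ from bar to bar, so the fill legend is left to identify the y levels.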
