How to make variable width histogram in R with labels aligned to bin edges?

Posted by 眉间皱痕 on 2021-02-10 06:13:20

Question


I'm using ggplot2, which by default creates histograms with fixed bin widths and places the axis labels at the center of each bin.

What I want instead is a variable-width histogram whose axis labels mark the end points of each bin, like this plot:

[Figure: desired plot]

To produce this example plot, I manually entered the bin parameters and shifted the bins to align them with their end points:

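library(ggplot2)
# Bin edges (lx, rx) and percentage of observations per bin (y)
income <- data.frame(lx=c(0,10,25,50,100), rx=c(10,25,50,100,150), y=c(20,28,27,18,7))
income$width <- income$rx - income$lx

# Center each bar on its bin midpoint; bar height is percent per unit of width
ggplot(income, aes(lx + width/2, y/width)) +
  geom_bar(aes(width = width), color = 'black', stat = 'identity') +
  scale_x_continuous(breaks = unique(c(income$lx, income$rx))) +
  labs(x = 'Income (thousands of $)', y = '% per thousand $')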

But I want to do this automatically from the original data, which can be approximated with the following code:

# For each bin i, draw income$y[i] integer values from [lx, rx)
incomes <- unlist(lapply(1:nrow(income), function(i) sample(income$lx[i]:(income$rx[i]-1), income$y[i], replace=TRUE)))
# Width of the bin each sampled value came from
widths <- rep(income$rx - income$lx, times = income$y)
incomes <- data.frame(incomes, widths)

Answer 1:


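You can produce a variable-width histogram by passing the desired breaks to geom_histogram. Map y to ..density.. (written after_stat(density) in ggplot2 3.4.0 and later, where the dot-dot notation is deprecated) rather than the default, which is the count. With density, each bar's area, not its height, equals the proportion of observations in its bin, so the total bar area is 1.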

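library(ggplot2)
breaks <- c(0, 10, 25, 50, 100, 150)

# after_stat(density) is the current spelling of ..density.. (ggplot2 >= 3.3.0)
ggplot(incomes, aes(incomes)) +
  geom_histogram(aes(y = after_stat(density)),
                 color = "black", fill = "grey40", breaks = breaks) +
  scale_x_continuous(breaks = breaks)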
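
To see what the density scaling does, here is a minimal base R sketch that recomputes the bar heights by hand. It is only an illustration: the vector x is hypothetical data generated the same way as in the question, and hist() with plot = FALSE stands in for the binning step (it is not what ggplot2 calls internally). The density of a bin is its count divided by (total count × bin width).

```r
# Hypothetical sample, built like the question's data: integers drawn from [lx, rx)
set.seed(1)
x <- c(sample(0:9,     20, replace = TRUE),
       sample(10:24,   28, replace = TRUE),
       sample(25:49,   27, replace = TRUE),
       sample(50:99,   18, replace = TRUE),
       sample(100:149,  7, replace = TRUE))
breaks <- c(0, 10, 25, 50, 100, 150)

# Bin without plotting; right = FALSE makes the bins left-closed, [a, b)
h <- hist(x, breaks = breaks, right = FALSE, plot = FALSE)

# Density by hand: count / (n * bin width)
widths <- diff(breaks)
manual_density <- h$counts / (sum(h$counts) * widths)

# The hand-computed heights should match hist()'s, and the areas should sum to 1
print(all.equal(h$density, manual_density))
print(sum(manual_density * widths))
```

Note that the sketch uses left-closed bins because the sampled values include the lower edge of each bin (10, 25, 50, 100). geom_histogram closes bins on the right by default, so a value of exactly 10 would be counted in the 0 to 10 bar; passing closed = "left" to geom_histogram should give the same binning as above.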



Source: https://stackoverflow.com/questions/37766893/how-to-make-variable-width-histogram-in-r-with-labels-aligned-to-bin-edges
