How can I create a histogram from aggregated data in R?

做~自己de王妃 submitted on 2019-11-28 10:06:15

To get this kind of flexibility, you may have to replicate your data. Here is one way of doing it with rep:

n <- 10
dat <- data.frame(
    x = sort(sample(1:50, n)),
    f = sample(1:100, n))
dat

expdat <- dat[rep(1:n, times=dat$f), "x", drop=FALSE]

Now you have your data replicated in the data.frame expdat, allowing you to call hist with different numbers of bins:

par(mfcol=c(1, 2))
hist(expdat$x, breaks=50, col="blue", main="50 bins")
hist(expdat$x, breaks=5, col="blue", main="5 bins")
par(mfcol=c(1, 1))

Take a look at ggplot2.

If your data is in a data.frame called df:

library(ggplot2)
ggplot(df, aes(x=Month, y=Frequency)) + geom_bar(stat='identity')

Or, if you want a continuous time axis:

df$Month<-as.POSIXct(paste(df$Month, '01', sep='-'),format='%Y-%m-%d')
ggplot(df,aes(x=Month,y=Frequency))+geom_bar(stat='identity')

Yes, rep-based solutions will waste too much memory in most interesting or large cases. The HistogramTools package on CRAN includes an efficient PreBinnedHistogram function that creates a base R histogram object directly from breaks and bin counts, which is the form of data the original question provided.
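If installing a package is not an option, the same idea can be sketched in base R: sum the frequencies per bin and assemble an object of class "histogram" by hand. This is only a rough sketch, not the HistogramTools implementation; the helper name weighted_hist and the sample data are made up for illustration, and the bins are assumed to be right-closed as in hist's defaults.

```r
# Hypothetical aggregated data: value x observed f times
dat <- data.frame(
    x = c(3, 8, 12, 19, 27, 31, 36, 42, 45, 49),
    f = c(10, 40, 25, 5, 60, 15, 30, 20, 50, 35))

# Illustrative helper (not part of any package): sums the frequencies
# that fall in each bin instead of replicating rows with rep
weighted_hist <- function(x, f, breaks) {
    # Right-closed bins with the lowest break included, like hist()
    bin <- cut(x, breaks = breaks, include.lowest = TRUE)
    counts <- as.vector(tapply(f, bin, sum))
    counts[is.na(counts)] <- 0    # bins with no observations
    widths <- diff(breaks)
    structure(
        list(breaks = breaks,
             counts = counts,
             density = counts / (sum(counts) * widths),
             mids = breaks[-1] - widths / 2,
             xname = "x",
             equidist = isTRUE(all.equal(min(widths), max(widths)))),
        class = "histogram")
}

h <- weighted_hist(dat$x, dat$f, breaks = seq(0, 50, by = 10))
plot(h, col = "blue", main = "5 bins, no replication")
```

Memory use here depends on the number of rows in dat rather than on the total of f, so it should stay small even when the frequencies are very large.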

Another possibility is to scale down your frequency variable by some large factor so that rep doesn't have as much work to do. Then adjust the vertical axis scale of the histogram by that same factor.
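A minimal sketch of the scaling idea, assuming made-up data and a hypothetical factor of 1000. Rounding the scaled frequencies loses some precision, so the resulting counts are approximate:

```r
# Hypothetical aggregated data with large frequencies
dat <- data.frame(
    x = c(3, 8, 12, 19, 27, 31, 36, 42, 45, 49),
    f = c(12000, 40500, 25000, 5000, 61000, 15000, 30000, 20000, 50000, 35000))

scale_factor <- 1000                        # assumed; choose to suit your data
f_small <- round(dat$f / scale_factor)      # approximate, because of rounding

# Replicate far fewer rows than rep(dat$x, dat$f) would need
x_small <- rep(dat$x, times = f_small)

# Compute the histogram without plotting, then rescale the counts
h <- hist(x_small, breaks = seq(0, 50, by = 10), plot = FALSE)
h$counts <- h$counts * scale_factor

plot(h, col = "blue", main = "Counts rescaled by 1000")
```

Only the counts are rescaled here; the density component is unaffected by a uniform factor and needs no adjustment.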
