Histogram with Logarithmic Scale and custom breaks

天大地大妈咪最大 submitted on 2019-11-26 20:14:49

A histogram is a poor man's density estimate. Note that a call to hist() with default arguments gives you frequencies, not probabilities. Add prob=TRUE (equivalently, freq=FALSE) to the call if you want probability densities.
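A minimal sketch of the difference between the two scalings. The data here are simulated for illustration and are not the data from the question:

```r
set.seed(1)                      # illustrative data, an assumption
x <- rexp(1000, rate = 0.5)

h <- hist(x, plot = FALSE)       # compute the bins without drawing

# counts are frequencies; density is scaled so the bar areas sum to 1
n_obs <- sum(h$counts)
area  <- sum(h$density * diff(h$breaks))

hist(x, freq = FALSE)            # y-axis now shows density, not counts
```

Here n_obs is the number of observations and area is the total area under the density bars.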

As for the log-axis problem, leave 'x' out of the log argument if you do not want the x-axis transformed:

plot(mydata_hist$counts, log="y", type='h', lwd=10, lend=2)

gets you bars on a log-y scale -- the look-and-feel is still a little different but can probably be tweaked.

Lastly, you can also do hist(log(x), ...) to get a histogram of the log of your data.
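A rough sketch of that last option, assuming strictly positive data (simulated here) and using log10 so that the axis can be relabelled in the original units:

```r
set.seed(42)                               # illustrative data, an assumption
x <- rlnorm(500, meanlog = 2, sdlog = 1)   # strictly positive values

h <- hist(log10(x), plot = FALSE)          # bin the logged data

# draw the bars without axes, then label the x-axis in original units
plot(h, axes = FALSE, main = "Histogram of log10(x)", xlab = "x (log scale)")
ticks <- seq(floor(min(h$breaks)), ceiling(max(h$breaks)))
axis(1, at = ticks, labels = 10^ticks)     # integer powers of 10
axis(2)
```

The bins are equally wide on the log scale, so they correspond to unequal intervals in the original units.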

Thierry

Another option would be to use the package ggplot2.

library(ggplot2)
ggplot(mydata, aes(x = V3)) + geom_histogram() + scale_x_log10()

It's not entirely clear from your question whether you want a logged x-axis or a logged y-axis. A logged y-axis is not a good idea when using bars because they are anchored at zero, which becomes negative infinity when logged. You can work around this problem by using a frequency polygon or density plot.
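For illustration, here is a frequency polygon on a logged y-axis. This sketch uses base graphics rather than ggplot2 and simulated data, both of which are my assumptions:

```r
set.seed(7)                       # illustrative data, an assumption
x <- rexp(2000, rate = 1)

h <- hist(x, breaks = 30, plot = FALSE)

# a log axis cannot show empty bins, so drop them before plotting
keep <- h$counts > 0

# points at the bin midpoints joined by lines: nothing is anchored at zero
plot(h$mids[keep], h$counts[keep], log = "y", type = "b",
     xlab = "x", ylab = "Frequency (log scale)")
```

Because the points sit at the bin midpoints, no bar has to extend down to zero.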

Dirk's answer is a great one. If you want an appearance like what hist produces, you can also try this:

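buckets <- c(0,1,2,3,4,5,25)
mydata_hist <- hist(mydata$V3, breaks=buckets, plot=FALSE)
# one label per bar: barplot() needs as many names as counts (6 bins from 7 breaks)
bin_labels <- paste(head(buckets, -1), tail(buckets, -1), sep="-")
bp <- barplot(mydata_hist$counts, log="y", col="white", names.arg=bin_labels)
text(bp, mydata_hist$counts, labels=mydata_hist$counts, pos=1)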

The last line is optional; it adds value labels just under the top of each bar. This can be useful on log-scale graphs.

I also pass the main, xlab, and ylab parameters to barplot() to provide a plot title, x-axis label, and y-axis label.

Run the hist() function without making a graph, log-transform the counts, and then draw the figure.

hist.data <- hist(my.data, plot=FALSE)
hist.data$counts <- log(hist.data$counts, 2)  # empty bins become -Inf
plot(hist.data, ylab="log2(Frequency)")

It should look just like the regular histogram, but the y-axis will show log2 of the frequency.
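A variation on the same idea, as a sketch: it uses log2(count + 1) instead of log2(count), which is my substitution to keep empty bins finite, and relabels the y-axis in original counts. The data are simulated stand-ins for my.data:

```r
set.seed(3)
my.data <- rnorm(1000)            # stand-in data, an assumption

h <- hist(my.data, plot = FALSE)
raw_counts <- h$counts
h$counts <- log2(raw_counts + 1)  # +1 offset: an empty bin maps to 0, not -Inf

# draw without axes, then put ticks where the original counts would fall
plot(h, axes = FALSE, ylab = "Frequency", main = "Histogram of my.data")
k <- ceiling(max(h$counts))
axis(1)
axis(2, at = 0:k, labels = 2^(0:k) - 1)   # tick at height i reads 2^i - 1
```

The bar heights are still on the log2 scale, but the tick labels can be read directly as counts.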

I've put together a function that behaves identically to hist in the default case, but accepts the log argument. It uses several tricks from other posters, but adds a few of its own. hist(x) and myhist(x) look identical.

The original problem would be solved with:

myhist(mydata$V3, breaks=c(0,1,2,3,4,5,25), log="xy")

The function:

myhist <- function(x, ..., breaks="Sturges",
                   main = paste("Histogram of", xname),
                   xlab = xname,
                   ylab = "Frequency") {
  xname = paste(deparse(substitute(x), 500), collapse="\n")
  h = hist(x, breaks=breaks, plot=FALSE)
  plot(h$breaks, c(NA,h$counts), type='S', main=main,
       xlab=xlab, ylab=ylab, axes=FALSE, ...)
  axis(1)
  axis(2)
  lines(h$breaks, c(h$counts,NA), type='s')
  lines(h$breaks, c(NA,h$counts), type='h')
  lines(h$breaks, c(h$counts,NA), type='h')
  lines(h$breaks, rep(0,length(h$breaks)), type='S')
  invisible(h)
}

Exercise for the reader: Unfortunately, not everything that works with hist works with myhist as it stands. That should be fixable with a bit more effort, though.

Here's a pretty ggplot2 solution:

library(ggplot2)
library(scales)  # makes pretty labels on the x-axis

breaks=c(0,1,2,3,4,5,25)

ggplot(mydata,aes(x = V3)) + 
  geom_histogram(breaks = log10(breaks)) + 
  scale_x_log10(
    breaks = breaks,
    labels = scales::trans_format("log10", scales::math_format(10^.x))
  )

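Note that the breaks passed to geom_histogram had to be log10-transformed, because the binning happens on the transformed scale set by scale_x_log10. Since log10(0) is -Inf, the first break is not finite on that scale; a small positive lower bound may be needed if that bin misbehaves.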
