问题
Background:
Normally, R gives quantiles for well-known distributions. Out of these quantiles, the lower 2.5% up to the upper 97.5% covers 95% of the area under these distributions.
Question:
Suppose I have a F distribution (df1 = 10, df2 = 90). In R, how can I determine the 95% of the area under this distribution such that this 95% only covers the HIGH DENSITY area, not the 95% that R normally gives (see my R code Below)?
Note: Clearly, the highest density is the "mode" (dashed line in the pic below). So I guess, one must move from "mode" toward the tails.
Here is my R code:
curve(df(x, 10, 90), 0, 3, ylab = 'Density', xlab = 'F value', lwd = 3)
Mode = ( (10 - 2) / 10 ) * ( 90 / (90 + 2) )
abline(v = Mode, lty = 2)
CI = qf( c(.025, .975), 10, 90)
arrows(CI[1], .05, CI[2], .05, code = 3, angle = 90, length = 1.4, col= 'red' )
points(Mode, .05, pch = 21, bg = 'green', cex = 3)
回答1:
Section 25.2 of DBDA2E gives complete R code for determining highest density intervals for distributions specified three ways: as cumulative density functions, as grid approximations, or as samples. For a cumulative density function, the function is called HDIofICDF()
. It's in the utilities script, DBDA2E-utilities.R
at the book's web site (linked above). Here's the code:
HDIofICDF = function( ICDFname , credMass=0.95 , tol=1e-8 , ... ) {
# Arguments:
# ICDFname is R’s name for the inverse cumulative density function
# of the distribution.
# credMass is the desired mass of the HDI region.
# tol is passed to R’s optimize function.
# Return value:
# Highest density interval (HDI) limits in a vector.
# Example of use: For determining HDI of a beta(30,12) distribution, type
# > HDIofICDF( qbeta , shape1 = 30 , shape2 = 12 )
# Notice that the parameters of the ICDFname must be explicitly named;
# e.g., HDIofICDF( qbeta , 30 , 12 ) does not work.
# Adapted and corrected from Greg Snow’s TeachingDemos package.
incredMass = 1.0 - credMass
intervalWidth = function( lowTailPr , ICDFname , credMass , ... ) {
ICDFname( credMass + lowTailPr , ... ) - ICDFname( lowTailPr , ... )
}
optInfo = optimize( intervalWidth , c( 0 , incredMass ) , ICDFname=ICDFname ,
credMass=credMass , tol=tol , ... )
HDIlowTailPr = optInfo$minimum
return( c( ICDFname( HDIlowTailPr , ... ) ,
ICDFname( credMass + HDIlowTailPr , ... ) ) )
}
回答2:
Have you tried this package: https://github.com/robjhyndman/hdrcde ?
Following your exemple:
library(hdrcde)
hdr.den(rf(1000,10,90),prob=95)
You can use various High Density Region and it works well for mulitmodal pdf.
hdr.den(c(rf(1000,10,90),rnorm(1000,4,1)),prob=c(50,75,95))
And
And you can even use it with multivariate distribution to visual 2D high density regions:
hdrs=c(50,75,95)
x=c(rf(1000,10,90),rnorm(1000,4,1))
y=c(rf(1000,5,50),rnorm(1000,7,1) )
par(mfrow=c(1,3))
hdr.den(x,prob=hdrs,xlab="x")
hdr.den(y,prob=hdrs,xlab="y")
hdr.boxplot.2d(x,y,prob=hdrs,shadecol="red",xlab="x",ylab="y")
来源:https://stackoverflow.com/questions/42986446/determining-high-density-region-for-a-distribution-in-r