How do I best simulate an arbitrary univariate random variate using its probability function?

温柔的废话 2020-12-02 17:54

In R, what's the best way to simulate an arbitrary univariate random variate if only its probability density function is available?

4 Answers
  • 2020-12-02 18:17

    To clarify the "use Metropolis-Hastings" answer above:

    Suppose ddist() is your probability density function.

    Then something like:

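    n <- 10000
    cand.sd <- 0.1                 # sd of the random-walk candidate distribution
    init <- 0                      # starting value; needs ddist(init) > 0
    vals <- numeric(n)
    vals[1] <- init
    oldprob <- ddist(init)         # density at the current value of the chain
    for (i in 2:n) {
        newval <- rnorm(1, mean = vals[i-1], sd = cand.sd)
        newprob <- ddist(newval)
        if (runif(1) < newprob/oldprob) {
            vals[i] <- newval
            oldprob <- newprob     # update only when the candidate is accepted
        } else {
            vals[i] <- vals[i-1]
        }
    }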
    

    Notes:

    1. completely untested
    2. efficiency depends on the candidate distribution (i.e. the value of cand.sd). For maximum efficiency, tune cand.sd to give an acceptance rate of 25-40%
    3. results will be autocorrelated (although you could always thin the chain, or sample() the results to scramble their order)
    4. you may need to discard a "burn-in" period if your starting value is far from the bulk of the distribution
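
    Notes 2 to 4 can be checked on a finished chain. A minimal sketch, assuming a standard normal target (dnorm stands in for ddist) and arbitrary settings: cand.sd = 2.5, a burn-in of 1000 draws, and keeping every 10th draw. The helper name run_chain is made up for this illustration:

    ```r
    set.seed(1)
    ddist <- dnorm   # assumption: stand-in target density

    # hypothetical helper: random-walk Metropolis chain of length n
    run_chain <- function(n, init, cand.sd) {
        vals <- numeric(n)
        vals[1] <- init
        oldprob <- ddist(init)
        for (i in 2:n) {
            newval <- rnorm(1, mean = vals[i-1], sd = cand.sd)
            newprob <- ddist(newval)
            if (runif(1) < newprob/oldprob) {
                vals[i] <- newval
                oldprob <- newprob
            } else {
                vals[i] <- vals[i-1]
            }
        }
        vals
    }

    vals <- run_chain(20000, init = 0, cand.sd = 2.5)

    # acceptance rate: fraction of steps where the chain moved
    acc.rate <- mean(diff(vals) != 0)

    # drop the burn-in, then keep every 10th draw to reduce autocorrelation
    kept <- vals[-(1:1000)]
    thinned <- kept[seq(1, length(kept), by = 10)]

    acc.rate
    acf(thinned, plot = FALSE)$acf[2]   # lag-1 autocorrelation after thinning
    ```

    If acc.rate falls well outside the range in note 2, rerun with a larger or smaller cand.sd.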

    The classical approach to this problem is rejection sampling (see e.g. Press et al., Numerical Recipes).
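
    A minimal sketch of rejection sampling, assuming the target density lives on a bounded interval [lo, hi] and is bounded above by a constant M there. The Beta(2, 5) target, the uniform envelope, and the name rejection_sample are illustrative choices, not taken from Numerical Recipes:

    ```r
    set.seed(1)
    ddist <- function(x) dbeta(x, 2, 5)   # assumption: example target on [0, 1]

    # hypothetical helper: rejection sampler with a uniform envelope on [lo, hi]
    # M must satisfy ddist(x) <= M for all x in [lo, hi]
    rejection_sample <- function(n, ddist, lo, hi, M) {
        out <- numeric(0)
        while (length(out) < n) {
            x <- runif(n, lo, hi)            # candidates from the envelope
            u <- runif(n, 0, M)              # uniform heights under the envelope
            out <- c(out, x[u < ddist(x)])   # keep points under the density curve
        }
        out[1:n]
    }

    # the Beta(2, 5) density peaks at about 2.46 (at x = 0.2), so M = 2.5 is a valid bound
    vals <- rejection_sample(10000, ddist, lo = 0, hi = 1, M = 2.5)
    hist(vals, breaks = 50, freq = FALSE)
    curve(ddist, add = TRUE)
    ```

    Unlike the Metropolis chain, the accepted draws are independent, but a loose bound M or a wide interval wastes many candidates.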

  • 2020-12-02 18:18

    Use the cumulative distribution function (CDF): http://en.wikipedia.org/wiki/Cumulative_distribution_function

    Then just use its inverse. See here for a better picture: http://en.wikipedia.org/wiki/Normal_distribution

    That means: pick a random number u from [0,1], treat it as a value of the CDF, and look up the value x at which the CDF equals u.

    The inverse CDF is also called the quantile function.
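
    When the CDF can be inverted in closed form, the method is a one-liner. A sketch assuming an exponential target with rate 2 (chosen only for illustration), where F(x) = 1 - exp(-rate * x) and hence the inverse is -log(1 - u) / rate:

    ```r
    set.seed(1)
    rate <- 2                      # assumption: example rate parameter
    u <- runif(10000)              # uniform draws on [0, 1]
    x <- -log(1 - u) / rate        # inverse CDF (quantile function) applied to u

    # the same transform using R's built-in quantile function
    x.builtin <- qexp(u, rate = rate)
    all.equal(x, x.builtin)
    ```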

  • 2020-12-02 18:20

    This is really a comment, but I don't have enough reputation to comment on Ben Bolker's answer.

    I am new to Metropolis, but IMHO this code is wrong because:

    a) newval is drawn from a normal distribution, whereas in other code I have seen it is drawn from a uniform distribution; this value must be drawn from the range covered by the random variable. For example, for a Gaussian distribution this should be something like runif(1, -5, +5).

    b) the prob value must be updated only on acceptance.

    I hope this helps, and I hope that someone with enough reputation can correct that answer (or mine, if I am wrong).

    # the target density
    ddist <- dnorm
    # number of draws
    n <- 100000
    # the center of the range is taken as init
    init <- 0
    # the following should go into a function
    vals <- numeric(n)
    vals[1] <- init
    # density at the starting value
    oldprob <- ddist(init)
    for (i in 2:n) {
      # independent uniform candidate over the (effective) range of the variable
      newval <- runif(1, -5, +5)
      newprob <- ddist(newval)
      if (runif(1) < newprob/oldprob) {
        vals[i] <- newval
        # update the stored density only on acceptance
        oldprob <- newprob
      } else vals[i] <- vals[i-1]
    }
    # final view
    hist(vals, breaks = 100)
    # and comparison
    hist(rnorm(length(vals)), breaks = 100)
    
  • 2020-12-02 18:24

    Here is a (slow) implementation of the inverse cdf method when you are only given a density.

    den <- dnorm  # replace with your own density

    # calculates the cdf by numerical integration of the density
    cdf <- function(x) integrate(den, -Inf, x)$value

    # inverts the cdf at probability p: brackets the root, then calls uniroot()
    inverse.cdf <- function(p, cdf, starting.value = 0) {
      # step left in growing strides until the cdf drops below p
      lower <- starting.value
      while (cdf(lower) >= (p - .000001))
        lower <- lower - (lower - starting.value)^2 - 1
      # step right in growing strides until the cdf rises above p
      upper <- starting.value
      while (cdf(upper) <= (p + .000001))
        upper <- upper + (upper - starting.value)^2 + 1
      uniroot(function(y) cdf(y) - p, c(lower, upper))$root
    }

    # generates 1000 random variates from the distribution 'den'
    vars <- vapply(runif(1000), inverse.cdf, numeric(1), cdf = cdf)
    hist(vars)
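
    If speed matters, the same idea can be approximated on a grid. A rough sketch, assuming the density is negligible outside a known interval (here [-8, 8], an arbitrary choice for the normal example); the grid size and the trapezoid rule are likewise illustrative, and accuracy in the far tails is limited by the grid:

    ```r
    set.seed(1)
    den <- dnorm                                 # replace with your own density
    grid <- seq(-8, 8, length.out = 10001)       # assumption: support is effectively [-8, 8]
    d <- den(grid)

    # trapezoid-rule cdf on the grid, rescaled so that it ends exactly at 1
    cdf.grid <- c(0, cumsum((d[-1] + d[-length(d)]) / 2 * diff(grid)))
    cdf.grid <- cdf.grid / cdf.grid[length(cdf.grid)]

    # interpolate grid points against cdf values: an approximate quantile function
    inv.cdf <- approxfun(cdf.grid, grid, ties = "ordered")

    vars <- inv.cdf(runif(100000))
    hist(vars, breaks = 100)
    ```

    This replaces one integrate() and uniroot() call per draw with a single interpolation, at the cost of having to choose the interval by hand.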
    