How to model a mixture of finite components from different parametric families with JAGS?

Posted by 喜你入骨 on 2020-01-04 07:37:06

Question


Imagine an underlying process that draws a number from a normal distribution with probability $\alpha$ and from a uniform distribution with probability $1 - \alpha$. The observed sequence of numbers generated by this process therefore follows a distribution $f$ that is a mixture of two components with mixing weights $\alpha$ and $1 - \alpha$. How would you model this kind of mixture with JAGS when the observed sequence is the only input and the parametric families are known?

Example (in R):

set.seed(8361299)
N <- 100
alpha <- 0.3
mu <- 5
max <- 50
# Which component to draw from? (1 = uniform, 0 = normal)
latent_class <- rbinom(N, 1, alpha)
Y <- ifelse(latent_class, runif(N, min=mu, max=max), rnorm(N, mean=mu))

The generated (observed) Y looks like: (figure not preserved)

Should it be possible with JAGS to recover the mixing weights as well as the parameters of the known components?


Answer 1:


Mixture models whose components all come from the same parametric family are pretty straightforward in JAGS/BUGS, but mixtures of different parametric families (like yours) are a little trickier. One method is to use the 'ones trick': we manually calculate the likelihood of each observation (selecting one of the two distributions as specified by the latent part of the model) and use it as the success probability of a Bernoulli trial whose (fake) observed response is always 1. For example:

# Your data generation:
set.seed(8361299)
N <- 100
alpha <- 0.3
mu <- 5
max <- 50
# Which component to draw from? (1 = uniform, 0 = normal)
latent_class <- rbinom(N, 1, alpha)
Y <- ifelse(latent_class, runif(N, min=mu, max=max), rnorm(N, mean=mu))

# The model:
model <- "model{

    for(i in 1:N){

        # Log density for the normal part:
        ld_norm[i] <- logdensity.norm(Y[i], mu, tau)

        # Log density for the uniform part:
        ld_unif[i] <- logdensity.unif(Y[i], lower, upper)

        # Select one of these two densities:
        density[i] <- exp(ld_norm[i]*norm_chosen[i] + ld_unif[i]*(1-norm_chosen[i]))

        # Generate a likelihood for the MCMC sampler (dbern needs density[i] <= 1):
        Ones[i] ~ dbern(density[i])

        # The latent class part as usual:
        norm_chosen[i] ~ dbern(prob)

    }

    # Priors:
    lower ~ dnorm(0, 10^-6)
    upper ~ dnorm(0, 10^-6)
    prob ~ dbeta(1,1)
    mu ~ dnorm(0, 10^-6)
    tau ~ dgamma(0.01, 0.01)

    # Specify monitors, data and initial values using runjags:
    #monitor# lower, upper, prob, mu, tau
    #data# N, Y, Ones
    #inits# lower, upper
}"

# Run the model using runjags (or use rjags if you prefer!)
library('runjags')

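# Initial values for the uniform bounds
# (max(Y) still calls the function even though max was set to 50 above):
lower <- min(Y)-10
upper <- max(Y)+10

# Fake all-ones response for the ones trick:
Ones <- rep(1,N)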

results <- run.jags(model, sample=20000, thin=1)
results

plot(results)

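This seems to recover your parameters pretty well. Note that prob is the weight of the normal component, and your simulation draws from the uniform when latent_class is 1, so your alpha is 1-prob. Watch out for autocorrelation (and convergence), though.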
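One way to make the autocorrelation and convergence warning concrete is sketched below in base R. The two chains are simulated AR(1) series standing in for real MCMC output (an assumption for illustration, not samples from the model above), and `simulate_chain` and `psrf` are hypothetical helper names. With a real runjags fit, the printed summary already reports comparable diagnostics, and `as.mcmc.list(results)` converts the samples for use with the coda package.

```r
# Hypothetical stand-in for two MCMC chains of one parameter (e.g. prob):
# AR(1) series with strong autocorrelation around a common centre.
set.seed(1)
n_iter <- 5000

simulate_chain <- function(n, rho, centre) {
  x <- numeric(n)
  x[1] <- centre
  for (t in 2:n) {
    x[t] <- centre + rho * (x[t - 1] - centre) + rnorm(1, sd = 0.02)
  }
  x
}

chains <- cbind(simulate_chain(n_iter, 0.95, 0.7),
                simulate_chain(n_iter, 0.95, 0.7))

# Lag-10 autocorrelation of the first chain (element 1 of $acf is lag 0)
ac10 <- acf(chains[, 1], lag.max = 10, plot = FALSE)$acf[11]

# Basic Gelman-Rubin potential scale reduction factor for one parameter
psrf <- function(ch) {
  n <- nrow(ch)
  W <- mean(apply(ch, 2, var))   # mean within-chain variance
  B <- n * var(colMeans(ch))     # between-chain variance
  sqrt(((n - 1) / n * W + B / n) / W)
}
rhat <- psrf(chains)

print(ac10)
print(rhat)
```

A lag-10 autocorrelation well above zero suggests running longer chains or thinning, and a potential scale reduction factor noticeably above 1 suggests the chains have not yet converged to the same distribution.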

Matt
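
A short sketch of why the ones trick in the model above yields the intended posterior. The notation ($z_i$ for the latent component of observation $i$, $f_k$ for the density of component $k$, $\pi_k$ for its mixing weight, $C$ for an optional log-scale constant) is introduced here for illustration and does not appear in the original code; $C = 0$ corresponds to the model above, where $z_i$ plays the role of norm_chosen[i].

```latex
\begin{align*}
p_i &= f_{z_i}(y_i \mid \theta_{z_i})\, e^{-C},
  \qquad \mathrm{Ones}_i \sim \mathrm{Bernoulli}(p_i),
  \quad \mathrm{Ones}_i = 1 \text{ observed} \\
\Pr(\mathrm{Ones}_i = 1 \mid z_i, \theta) &= p_i \\
p(\theta, \pi, z \mid \mathrm{Ones})
  &\propto p(\theta)\, p(\pi) \prod_{i=1}^{N} \pi_{z_i}\, f_{z_i}(y_i \mid \theta_{z_i})\, e^{-C} \\
  &\propto p(\theta)\, p(\pi) \prod_{i=1}^{N} \pi_{z_i}\, f_{z_i}(y_i \mid \theta_{z_i})
\end{align*}
```

The factor $e^{-NC}$ does not depend on any parameter, so it drops out of the posterior. The one requirement is $0 \le p_i \le 1$, that is $\log f_{z_i}(y_i \mid \theta_{z_i}) \le C$, because $p_i$ is used as a Bernoulli probability.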


EDIT: Since you asked about generalising to more than 2 distributions, here is equivalent (but more general) code:

# The model:
model <- "model{
    for(i in 1:N){
        # Log density for the normal part:
        ld_comp[i, 1] <- logdensity.norm(Y[i], mu, tau)
        # Log density for the uniform part:
        ld_comp[i, 2] <- logdensity.unif(Y[i], lower, upper)
        # Select one of these two densities and normalise with a Constant:
        density[i] <- exp(ld_comp[i, component_chosen[i]] - Constant)
        # Generate a likelihood for the MCMC sampler:
        Ones[i] ~ dbern(density[i])
        # The latent class part using dcat:
        component_chosen[i] ~ dcat(probs)
    }
    # Prior for the 2 mixing weights using a Dirichlet distribution:
    probs ~ ddirch(c(1,1))
    lower ~ dnorm(0, 10^-6)
    upper ~ dnorm(0, 10^-6)
    mu ~ dnorm(0, 10^-6)
    tau ~ dgamma(0.01, 0.01)
    # Specify monitors, data and initial values using runjags:
    #monitor# lower, upper, probs, mu, tau
    #data# N, Y, Ones, Constant
    #inits# lower, upper, mu, tau
}"

library('runjags')

# Initial values to get the chains started:
lower <- min(Y)-10
upper <- max(Y)+10
mu <- 0
tau <- 0.01

Ones <- rep(1,N)

# Constant is subtracted on the log scale, i.e. every density is divided by exp(Constant).
# It needs to be big enough to keep every scaled density at or below 1, but small enough that the scaled densities do not underflow to 0:
Constant <- 10

results <- run.jags(model, sample=10000, thin=1)
results

This code will work for as many distributions as you need, but expect autocorrelation to get much worse as you add distributions.
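
A quick way to sanity-check a candidate value of Constant before running the generalised model is sketched below in base R. It evaluates both log densities at parameter values chosen by hand (the simulation values and the initial bounds, an assumption for illustration rather than values the sampler is guaranteed to visit) and checks that every scaled density stays at or below 1. The names `ld_comp`, `scaled`, `all_below_one` and `n_underflow` are used only in this sketch.

```r
# Regenerate the example data from the question
set.seed(8361299)
N <- 100
latent_class <- rbinom(N, 1, 0.3)
Y <- ifelse(latent_class, runif(N, min = 5, max = 50), rnorm(N, mean = 5))

Constant <- 10

# Log densities at hand-picked parameter values (assumed, for illustration only)
ld_comp <- cbind(
  norm = dnorm(Y, mean = 5, sd = 1, log = TRUE),
  unif = dunif(Y, min = min(Y) - 10, max = max(Y) + 10, log = TRUE)
)

# Same scaling as in the JAGS model: exp(log density - Constant)
scaled <- exp(ld_comp - Constant)

# Upper check: every value passed to dbern() must be at most 1
all_below_one <- all(scaled <= 1)

# Lower check: count values that underflow to exactly 0 in double precision
n_underflow <- colSums(scaled == 0)

print(all_below_one)
print(n_underflow)
```

If the normal column reports underflows, those observations lie so far in the tail that they can effectively only be assigned to the uniform component, so the initial values of component_chosen may need some care. The upper check is only as good as the parameter values tried: a very large tau would raise the peak of the normal density, so it can be worth repeating the check with a smaller sd.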



Source: https://stackoverflow.com/questions/36609365/how-to-model-a-mixture-of-finite-components-from-different-parametric-families-w
