Question
I have a Gaussian mixture distribution object obj with 64 dimensions, and I would like to pass it to the pdf function to find the probability of a certain point.
Yet when I type pdf(obj,obj.mu(1,:)) to test the object, it yields a very high probability (like 2.4845e+069).
That does not make sense, because a probability should lie between zero and one.
Is there a problem with my MATLAB?
P.S. Even pdf(obj,obj.mu(1,:)+obj.Sigma(1,1)*rand()) yields a high probability (2.1682e+069).
Answer 1:
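First things first: a probability density function is not bounded above by 1; it only has to integrate to 1 over its domain. The value that pdf returns is a density, not a probability, so it can be arbitrarily large wherever the distribution is tightly concentrated.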
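To make this concrete, here is a minimal MATLAB sketch, assuming the Statistics Toolbox is available. The sigma value and the scaled-identity covariance are illustrative choices, not values taken from the question's model. It shows a density that exceeds 1 while still integrating to 1, and how large the peak becomes in 64 dimensions:

```matlab
% 1-D case: a narrow Gaussian has a peak density far above 1.
sigma  = 0.01;                             % illustrative standard deviation
peak1d = normpdf(0, 0, sigma);             % equals 1/(sigma*sqrt(2*pi)), about 39.9

% The area under the curve is still (numerically) 1.
x    = linspace(-1, 1, 200001);
area = trapz(x, normpdf(x, 0, sigma));

% 64-D case: the peak at the mean is (2*pi)^(-d/2) * det(Sigma)^(-1/2).
d       = 64;
Sigma   = 1e-3 * eye(d);                   % illustrative: variance 1e-3 in every dimension
peak64d = mvnpdf(zeros(1, d), zeros(1, d), Sigma);   % on the order of 1e70

fprintf('1-D peak = %g, area = %g, 64-D peak = %g\n', peak1d, area, peak64d);
```

In high dimensions even modest per-dimension variances multiply into very large density values, which is why comparing log-densities is usually more practical than comparing raw pdf outputs.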
Moreover, what you are seeing is the problem of singularities (see page 434, figure 9.7) that arises when fitting a Gaussian mixture model. When a component collapses onto a single data point, its variance goes to 0 and the PDF explodes at that point. This is often encountered with Gaussian mixture models because the log-likelihood is not concave and the likelihood function has many local maxima. We try to find a well-behaved local maximum that performs well, and the singularities are particularly bad cases.
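The effect of a collapsed component can be reproduced with a small hand-built mixture. This is only a sketch: the means, covariances, mixing proportions, and the tol threshold below are hypothetical, and the code assumes full, non-shared covariances stored as a d-by-d-by-k array.

```matlab
% Hypothetical 2-D mixture: component 1 is well behaved, component 2 has
% nearly collapsed (variance close to 0 in every direction).
mu    = [0 0; 5 5];
Sigma = cat(3, eye(2), 1e-8 * eye(2));     % d-by-d-by-k array of full covariances
p     = [0.5 0.5];                         % mixing proportions
obj   = gmdistribution(mu, Sigma, p);

% The density at the collapsed component's mean is enormous (roughly 8e6 here).
peak = pdf(obj, mu(2, :));

% Flag components whose smallest covariance eigenvalue is suspiciously small.
tol = 1e-6;                                % illustrative threshold, not a MATLAB default
for j = 1:size(obj.mu, 1)
    minEig = min(eig(obj.Sigma(:, :, j)));
    fprintf('component %d: min eigenvalue = %g, collapsed = %d\n', j, minEig, minEig < tol);
end
```

Running the same eigenvalue loop on a fitted object is one way to check whether any of its components has degenerated.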
When you see this, rerun the algorithm with different starting points or reduce the number of components you are using. The book referenced above also recommends simply resetting the offending component to a different value.
Another option is a Bayesian approach: adopt a prior or a regularization term for your parameters, which penalizes outlandish values such as sigma parameters of 0.
You can indirectly control the first remedy (different starting points) through the starting values passed to gmdistribution.fit. For the second (regularization), you can use the Regularize argument: http://www.mathworks.com/help/stats/gmdistribution.fit.html
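As a rough sketch of both remedies together, the call below uses the 'Replicates' and 'Regularize' name-value options of gmdistribution.fit. The data matrix X, the number of components k, and the option values are placeholders chosen for illustration and would need tuning for real data.

```matlab
rng(0);                                    % reproducible synthetic data
X = randn(2000, 64);                       % hypothetical stand-in for the real data
k = 3;                                     % illustrative number of components

% 'Replicates' reruns EM from several random starts and keeps the best fit;
% 'Regularize' adds a small nonnegative value to the diagonal of each covariance.
obj = gmdistribution.fit(X, k, 'Replicates', 5, 'Regularize', 1e-5);

% Work with log-densities rather than raw densities to avoid huge numbers.
logDensity = log(pdf(obj, obj.mu(1, :)));
fprintf('log-density at the first component mean: %g\n', logDensity);
```

A larger 'Regularize' value pushes the covariances further from singularity at the cost of biasing the fit, so it is worth starting small and increasing it only if components still collapse.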
Source: https://stackoverflow.com/questions/15541575/using-matlab-function-pdf