MLE error in R: initial value in 'vmmin' is not finite

Anonymous (unverified), submitted 2019-12-03 01:10:02

Question:

Suppose I have 2 data.frame objects:

    df1 <- data.frame(x = 1:100)
    df1$y <- 20 + 0.3 * df1$x + rnorm(100)
    df2 <- data.frame(x = 1:200000)
    df2$y <- 20 + 0.3 * df2$x + rnorm(200000)

I want to estimate the parameters by maximum likelihood (MLE). With df1 everything runs without error:

    LL1 <- function(a, b, mu, sigma) {
        R = dnorm(df1$y - a - b * df1$x, mu, sigma)
        -sum(log(R))
    }
    library(stats4)
    mle1 <- mle(LL1, start = list(a = 20, b = 0.3, sigma = 0.5),
                fixed = list(mu = 0))

    > mle1
    Call:
    mle(minuslogl = LL1, start = list(a = 20, b = 0.3, sigma = 0.5),
        fixed = list(mu = 0))

    Coefficients:
              a           b          mu       sigma
    23.89704180  0.07408898  0.00000000  3.91681382

But when I do the same with df2, I get an error:

    LL2 <- function(a, b, mu, sigma) {
        R = dnorm(df2$y - a - b * df2$x, mu, sigma)
        -sum(log(R))
    }
    mle2 <- mle(LL2, start = list(a = 20, b = 0.3, sigma = 0.5),
                fixed = list(mu = 0))
    Error in optim(start, f, method = method, hessian = TRUE, ...) :
      initial value in 'vmmin' is not finite

How can I overcome it?

Answer 1:

The value of R becomes zero for some observations. log(R) is then -Inf, so the function to be minimized is not finite, and optim stops with this error.
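To see the failure in isolation, here is a minimal sketch with made-up residuals (the vector `res` is hypothetical and not taken from the question's data). Once a standardized residual lies far enough in the tail, `dnorm` underflows to exactly zero and the negative log-likelihood stops being finite.

```r
# Hypothetical residuals: two ordinary values and one extreme outlier.
res <- c(0.1, -0.5, 40)

dens <- dnorm(res, mean = 0, sd = 1)   # the density at 40 underflows to 0
log_dens <- log(dens)                  # log(0) is -Inf

nll <- -sum(log_dens)                  # so the summed objective is Inf
is.finite(nll)
```

A single such observation is enough to make the whole sum non-finite, whatever the other residuals are.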

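Passing log=TRUE to dnorm handles this better, because the log-density is computed directly instead of taking the log of a density that may already have underflowed; see the function LL3 below. The following gives some warnings, but a result is returned, with parameter estimates close to the true parameters.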

    require(stats4)
    set.seed(123)
    e <- rnorm(200000)
    x <- 1:200000
    df3 <- data.frame(x)
    df3$y <- 20 + 0.3 * df3$x + e
    LL3 <- function(a, b, mu, sigma) {
      -sum(dnorm(df3$y - a - b * df3$x, mu, sigma, log=TRUE))
    }
    mle3 <- mle(LL3, start = list(a = 20, b = 0.3, sigma = 0.5),
      fixed = list(mu = 0))
    Warning messages:
    1: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
    2: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
    3: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
    4: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
    5: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
    6: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
    7: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
    8: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced

    > mle3
    Call:
    mle(minuslogl = LL3, start = list(a = 20, b = 0.3, sigma = 0.5),
        fixed = list(mu = 0))

    Coefficients:
            a         b        mu     sigma
    19.999166  0.300000  0.000000  1.001803
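The NaN warnings above most likely arise when the optimizer tries a negative sigma, for which dnorm returns NaN. That is an assumption, as the answer does not say where they come from. One common way to rule it out is to optimize over log(sigma) instead. The sketch below only defines and evaluates such an objective; the names `LL4`, `log_sigma`, `df4`, `val_start`, `val_far` and `val_naive` are illustrative and do not appear in the original answer.

```r
set.seed(123)
# Small simulated data set following the model in the question (illustrative only).
df4 <- data.frame(x = 1:100)
df4$y <- 20 + 0.3 * df4$x + rnorm(100)

# Negative log-likelihood in terms of log(sigma): sd = exp(log_sigma) is always
# positive, so dnorm() never receives a negative standard deviation.
LL4 <- function(a, b, log_sigma) {
  -sum(dnorm(df4$y - a - b * df4$x, mean = 0, sd = exp(log_sigma), log = TRUE))
}

# Finite at the starting values used in the question ...
val_start <- LL4(a = 20, b = 0.3, log_sigma = log(0.5))
# ... and still finite at a poor guess with a very small sigma.
val_far <- LL4(a = 0, b = 0, log_sigma = -3)

# For comparison, log(dnorm(...)) at the same poor guess underflows and gives Inf.
val_naive <- -sum(log(dnorm(df4$y, mean = 0, sd = exp(-3))))
```

A function of this shape could be passed to mle() with start = list(a = 20, b = 0.3, log_sigma = log(0.5)); the estimate of sigma would then be exp() of the fitted log_sigma.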


Answer 2:

I had the same problem when minimizing a log-likelihood function. After some debugging I found that the problem was in my starting values. They caused one specific matrix to have a determinant of 0, which caused an error when the log of it was taken. So optim could not find any "finite" value, but only because the function returned an error to optim.

Bottom line: check whether your function returns an error when you run it with the starting values.
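That check can be sketched as below. This is a minimal illustration, not the answerer's code: `negll` is a hypothetical objective built around the log of a determinant, and `check_start` is a hypothetical helper that evaluates the objective once before any optimizer is called.

```r
# Hypothetical objective: minus the log-determinant of a 2 x 2 matrix built from p.
negll <- function(p) {
  m <- matrix(c(p[1], p[2], p[2], p[1]), nrow = 2)
  -log(det(m))
}

# Evaluate the objective once at the proposed starting values.
# Returns TRUE only if it runs without error and yields a finite number.
check_start <- function(fn, start) {
  val <- tryCatch(fn(start), error = function(e) NA_real_)
  is.finite(val)
}

ok_bad  <- check_start(negll, c(1, 1))   # determinant is 0, objective is Inf
ok_good <- check_start(negll, c(2, 1))   # determinant is 3, objective is finite
```

If the check fails, the starting values (or the objective itself) need fixing before optim or mle is called.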

PS.: Marius Hofert is completely right. Never suppress warnings.


