R MICE imputation failing

Anonymous (unverified), submitted 2019-12-03 00:52:01

Question:

I am really baffled about why my imputation is failing in R's mice package (version 2.22). I am attempting a very simple operation with the following data frame:

> dfn
   a b c  d
1  0 1 0  1
2  1 0 0  0
3  0 0 0  0
4 NA 0 0  0
5  0 0 0 NA

I then use mice in the following way to perform a simple mean imputation:

imp <- mice(dfn, method = "mean", m = 1, maxit = 1)
filled <- complete(imp)

However, my completed data looks like this:

> filled
     a b c  d
1 0.00 1 0  1
2 1.00 0 0  0
3 0.00 0 0  0
4 0.25 0 0  0
5 0.00 0 0 NA

Why am I still getting this trailing NA? This is the simplest failing example I could construct, but my real data set is much larger and I am just trying to get a sense of where things are going wrong. Any help would be greatly appreciated!

Answer 1:

I'm not really sure how accurate this is, but here is an attempt. Even though method="mean" is supposed to impute the unconditional mean, it appears from the documentation that the predictorMatrix is not changed accordingly.

Normally, leftover NAs occur because the predictors suffer from multicollinearity or because there are too few cases per variable (so that the imputation model cannot be estimated). However, method="mean" shouldn't behave that way.
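Both conditions can be checked on the example data without mice. The following is only a rough base R sketch, not something mice itself runs; the names constant_cols and cor_bd are made up for illustration:

```r
# Example data from the question, rebuilt by hand
dfn <- data.frame(
  a = c(0, 1, 0, NA, 0),
  b = c(1, 0, 0, 0, 0),
  c = c(0, 0, 0, 0, 0),
  d = c(1, 0, 0, 0, NA)
)

# Columns whose observed values never vary carry no information as predictors
constant_cols <- names(dfn)[sapply(dfn, function(x) var(x, na.rm = TRUE) == 0)]
print(constant_cols)
# "c"

# Correlation between b and d on the rows where both are observed
cor_bd <- cor(dfn$b, dfn$d, use = "pairwise.complete.obs")
print(cor_bd)
# 1 (b and d agree on every row where both are observed)
```

In this toy data, c is constant and d is perfectly correlated with b on the observed rows, so the data does show the kind of collinearity described above. Whether that is what leaves the NA in d is an assumption, not something confirmed here.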

Here is what I did:

library(mice)

dfn <- read.table(text="a b c  d
 0 1 0  1
 1 0 0  0
 0 0 0  0
NA 0 0  0
 0 0 0 NA", header=TRUE)

# Use an identity predictorMatrix instead of the default one
imp <- mice( dfn, method="mean", predictorMatrix=diag(ncol(dfn)) )
complete(imp)

# 1 0.00 1 0 1.00
# 2 1.00 0 0 0.00
# 3 0.00 0 0 0.00
# 4 0.25 0 0 0.00
# 5 0.00 0 0 0.25

You can try this using your actual data set, but you should check the results carefully. For example, do:

sapply(dfn, function(x) mean(x,na.rm=TRUE)) 

The means for each variable should be identical to those that have been imputed. Please let me know if this solves your problem.
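For comparison, unconditional mean imputation can also be written by hand in base R, which gives a reference to check the mice output against. This is a minimal sketch assuming the example data from the question; col_means and manual are illustrative names, not part of mice:

```r
# Example data from the question
dfn <- data.frame(
  a = c(0, 1, 0, NA, 0),
  b = c(1, 0, 0, 0, 0),
  c = c(0, 0, 0, 0, 0),
  d = c(1, 0, 0, 0, NA)
)

# Observed mean of every column, ignoring missing values
col_means <- sapply(dfn, function(x) mean(x, na.rm = TRUE))

# Replace each NA with the observed mean of its own column
manual <- dfn
for (v in names(manual)) {
  manual[[v]][is.na(manual[[v]])] <- col_means[[v]]
}

print(manual)
```

Here the missing value in a (row 4) and the missing value in d (row 5) are both filled with 0.25, which matches the completed data shown in the answer.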


