What exactly does complete in mice do?

╄→尐↘猪︶ㄣ submitted on 2021-02-10 17:51:40

Question


I am researching how to use multiple imputation results. The following is my understanding; please let me know if there are mistakes.

Suppose you have a data set with missing values and you want to conduct a regression analysis. You may perform multiple imputation with m = 5, which gives 5 imputed data sets. You run the regression on each imputed data set, then "pool" the coefficient estimates from these m = 5 models via Rubin's rules (or use the pool() function from the R package mice).

My question concerns the function complete() in mice. The manual says you can extract a completed data set by using complete(object).

But if I run mice with m = 5, does it still make sense to use complete()? Which imputation results will complete() return?

Also, does it make sense to use mice with only m = 1? Thank you.


Answer 1:


You probably overlooked that mice::complete() uses action=1 as its default, which "returns the first imputed data set" (see ?mice::complete). On its own that is practically worthless, because it ignores the other m - 1 imputations.

You should definitely use action="long", which stacks all m imputed data sets, to account for the "multiplicity" of the multiple imputation!
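As a rough illustration of the difference between these action values, here is a minimal sketch. It assumes mice version 3.0 or later and uses nhanes, a small example data set shipped with the package; the seed and the object names are arbitrary choices for this example, not part of the original answer.

```r
library(mice)

# nhanes: example data set bundled with mice (25 rows, some values missing)
imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

first <- complete(imp)          # default action = 1: only the first imputed data set
third <- complete(imp, 3)       # the third imputed data set
long  <- complete(imp, "long")  # all 5 imputed data sets stacked, indexed by .imp and .id
all5  <- complete(imp, "all")   # a list holding the 5 imputed data sets

nrow(first)      # same number of rows as nhanes
nrow(long)       # 5 times as many rows as nhanes
table(long$.imp) # rows per imputation
```

Note that the stacked "long" data frame is meant for splitting by .imp and analyzing each part separately; fitting one model to the whole stack would treat the imputed values as if they were extra observations.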

No, it makes no sense at all to use m=1 (apart from debugging), because every imputation is based on a random process and you have to pool the results (using any method whatsoever) to account for the variation between imputations. Often m > 20 is recommended.

Basically, multiple imputation works as follows:

  1. Run m imputation processes, each with a random component.
  2. Obtain m slightly different imputed data sets.
  3. Analyze each imputed data set to get slightly different parameter estimates.
  4. Combine the results, taking into account the variation in the parameter estimates across imputations.

(Also see multiple-imputation-in-a-nutshell for a brief overview.)
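The four steps can be sketched in mice roughly as follows. This is a minimal example under stated assumptions: it uses the bundled nhanes data and an arbitrary regression model (bmi ~ age + chl) chosen only for illustration, and it assumes mice 3.0 or later. The second half recomputes the pooled estimate and total variance for one coefficient by hand using Rubin's rules, to show what pool() combines.

```r
library(mice)

# Steps 1-2: m = 5 imputations with a random component (seed fixed for reproducibility)
imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

# Step 3: fit the same model to each imputed data set
fits <- with(imp, lm(bmi ~ age + chl))

# Step 4: combine the 5 sets of estimates with Rubin's rules
pooled <- pool(fits)
summary(pooled)

# The same pooling done by hand for the coefficient of age
est <- sapply(fits$analyses, function(f) coef(f)[["age"]])         # per-imputation estimates
u   <- sapply(fits$analyses, function(f) vcov(f)[["age", "age"]])  # per-imputation variances
m   <- length(est)

q_bar <- mean(est)               # pooled point estimate
u_bar <- mean(u)                 # within-imputation variance
b     <- var(est)                # between-imputation variance
t_var <- u_bar + (1 + 1 / m) * b # total variance
sqrt(t_var)                      # pooled standard error
```

With m = 1 the between-imputation variance b cannot be estimated at all, which is why a single imputation understates the uncertainty.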




Answer 2:


When you use mice(), you get an object (of class mids) that is not itself an imputed data set. You cannot perform operations on it directly without using the special functions in mice. If you want to extract the actual imputed data sets, you use complete(), the output of which is a data.frame with one row per individual per imputation (if using the "long" format). If you are doing any analysis with your imputed data that cannot be performed within mice, you need to create this data set first.
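A small sketch of that extraction step, assuming mice 3.0 or later and the bundled nhanes data; the per-imputation mean of bmi is only a stand-in for whatever analysis has to happen outside mice.

```r
library(mice)

imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)
class(imp)  # "mids": a container for the imputations, not a data frame

# Extract an ordinary data frame: one row per individual per imputation
long <- complete(imp, "long")
is.data.frame(long)

# Any base R tool now works, e.g. a summary computed separately per imputation
bmi_means <- tapply(long$bmi, long$.imp, mean)
bmi_means

# include = TRUE also keeps the original incomplete data, labelled .imp = 0
long0 <- complete(imp, "long", include = TRUE)
sort(unique(long0$.imp))
```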



Source: https://stackoverflow.com/questions/51370292/what-exactly-does-complete-in-mice-do
