Imputation in R

别跟我提以往 2021-02-01 10:58

I am new to the R programming language. Is there a way to impute the missing (NA) values of just one column in a dataset? Because all of imputation co

3 answers
  •  南旧
    南旧 (original poster)
    2021-02-01 11:44

    Why not use a more sophisticated imputation algorithm, such as mice (Multiple Imputation by Chained Equations)? Below is an R code snippet you can adapt to your case.

    library(mice)
    
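    # Load the nhanes example dataset that ships with mice
    dat <- mice::nhanes
    
    # Impute it with mice, creating m = 3 imputed datasets
    imp <- mice(dat, m = 3, printFlag = FALSE)
    
    # Extract the first of the three completed datasets
    imputed_dataset_1 <- complete(imp, 1)
    
    head(imputed_dataset_1)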
    
    #     age  bmi hyp chl
    # 1   1   22.5   1 118
    # 2   2   22.7   1 187
    # 3   1   30.1   1 187
    # 4   3   24.9   1 186
    # 5   1   20.4   1 113
    # 6   3   20.4   1 184
    
    # Now let's see which method was used to impute each column
    meth <- imp$method
    #   age   bmi   hyp   chl
    #    "" "pmm" "pmm" "pmm"
    
    # The age column is complete, so it is not imputed (empty method "")
    # Columns bmi, hyp and chl are imputed with pmm (predictive mean matching)
    
    # Let's say that we want to impute only the "hyp" column
    # So we set the methods for the bmi and chl columns to ""
    # (indexing by name is safer than by position if the column order changes)
    meth[c("bmi", "chl")] <- ""
    #   age   bmi   hyp   chl
    #    ""    "" "pmm"    ""
    
    # Run the mice imputation again, this time passing our modified vector as the method argument
    imp <- mice(dat, m = 3, printFlag = FALSE, method = meth)
    
    # Extract the third of the three completed datasets
    partly_imputed_dataset_3 <- complete(imp, 3)
    
    head(partly_imputed_dataset_3)
    
    #    age  bmi hyp chl
    # 1   1   NA   1  NA
    # 2   2 22.7   1 187
    # 3   1   NA   1 187
    # 4   3   NA   2  NA
    # 5   1 20.4   1 113
    # 6   3   NA   2 184
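
If a simple baseline is enough, a single column can also be filled in with base R alone. The following is only a minimal sketch: the data frame `df` and the helper `impute_column_median()` are hypothetical names made up for illustration and do not appear in the post above. Note that median imputation ignores the relationships between variables, which is the main reason to prefer a model-based approach such as mice.

```r
# Hypothetical toy data, loosely shaped like nhanes (values are made up)
df <- data.frame(
  age = c(1, 2, 1, 3, 1),
  bmi = c(NA, 22.7, NA, 24.9, 20.4),
  hyp = c(1, NA, 1, NA, 1)
)

# Hypothetical helper: replace the NAs of ONE numeric column with that
# column's median, leaving every other column untouched
impute_column_median <- function(data, column) {
  stopifnot(column %in% names(data), is.numeric(data[[column]]))
  missing_rows <- is.na(data[[column]])
  data[[column]][missing_rows] <- median(data[[column]], na.rm = TRUE)
  data
}

# Impute only bmi; hyp keeps its NAs
df_imputed <- impute_column_median(df, "bmi")
```

Calling `colSums(is.na(df_imputed))` afterwards is a quick way to confirm that only the chosen column lost its missing values.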
    
