Replace NA with the mean of the previous and next rows in R

余生分开走 2020-12-06 05:41

How can I quickly replace an NA with the mean of the values in the previous and next rows?

  name grade
1    A    56
2    B    NA
3    C    70
4    D    96
         


        
3 Answers
  •  北荒 (original poster)
    2020-12-06 06:38

    An alternative that uses the median instead of the mean is the na.roughfix function from the randomForest package. As described in its documentation, it works on a data frame or a numeric matrix. For numeric variables, NAs are replaced with the column median (the median of the whole column, not of the neighbouring rows). For factor variables, NAs are replaced with the most frequent level (ties are broken at random). If the object contains no NAs, it is returned unaltered.

    Using the same example as @Henrik:

    library(randomForest)
    x <- c(56, NA, 70, 96) 
    na.roughfix(x)
    
    #[1] 56 70 70 96
    

    or with a larger matrix:

    y <- matrix(1:50, nrow = 10)
    # Blank out 4 random cells; no seed is set, so the positions vary between runs
    y[sample(length(y), 4)] <- NA
    y
    #      [,1] [,2] [,3] [,4] [,5]
    # [1,]    1   11   21   31   41
    # [2,]    2   12   22   32   42
    # [3,]    3   NA   23   33   NA
    # [4,]    4   14   24   34   44
    # [5,]    5   15   25   35   45
    # [6,]    6   16   NA   36   46
    # [7,]    7   17   27   37   47
    # [8,]    8   18   28   38   48
    # [9,]    9   19   29   39   49
    #[10,]   10   20   NA   40   50
    
    na.roughfix(y)
    #      [,1] [,2] [,3] [,4] [,5]
    # [1,]    1   11 21.0   31   41
    # [2,]    2   12 22.0   32   42
    # [3,]    3   16 23.0   33   46
    # [4,]    4   14 24.0   34   44
    # [5,]    5   15 25.0   35   45
    # [6,]    6   16 24.5   36   46
    # [7,]    7   17 27.0   37   47
    # [8,]    8   18 28.0   38   48
    # [9,]    9   19 29.0   39   49
    #[10,]   10   20 24.5   40   50
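
For comparison with the column-median approach, here is a minimal base R sketch of what the question literally asks for: replacing each NA with the mean of the values directly above and below it. The helper name fill_na_neighbour_mean is hypothetical (it is not part of randomForest or any other package). It assumes a numeric vector with isolated NAs; an NA in the first or last position, or next to another NA, is left as NA.

```r
# Hypothetical helper, not from any package: replace each NA with the mean of
# the values immediately before and after it.
fill_na_neighbour_mean <- function(x) {
  n <- length(x)
  prev_val <- c(NA, x[-n])   # value in the previous row (NA for the first row)
  next_val <- c(x[-1], NA)   # value in the next row (NA for the last row)
  idx <- is.na(x)
  # If either neighbour is missing, the result stays NA
  x[idx] <- (prev_val[idx] + next_val[idx]) / 2
  x
}

# The data frame from the question
df <- data.frame(name = c("A", "B", "C", "D"), grade = c(56, NA, 70, 96))
df$grade <- fill_na_neighbour_mean(df$grade)
df$grade
# [1] 56 63 70 96
```

The function is fully vectorised (two shifted copies of the column and one indexed assignment), so it should scale to long columns without an explicit loop.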
    
