R: data.table count !NA per row

予麋鹿 2020-12-16 14:24

I am trying to count the number of columns that do not contain NA for each row, and place that value into a new column for that row.

Example data:

li         


        
2 answers
  •  一个人的身影
    2020-12-16 14:38

    Try this one using Reduce to chain together + calls:

    d[, num_obs := Reduce(`+`, lapply(.SD,function(x) !is.na(x)))]
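
    To make the mechanics concrete, here is a minimal sketch on a small made-up table (the column names v1..v4 and the values are illustrative assumptions, not the asker's data). lapply(.SD, ...) yields one logical vector per column, TRUE where the cell is not NA, and Reduce adds those vectors element-wise, so each row ends up with its count of non-NA cells.

    ```r
    library(data.table)

    # Hypothetical example data: 3 rows, 4 columns of mixed types, some NAs
    d <- data.table(
      v1 = c(1, NA, 3),
      v2 = c(NA, NA, 6),
      v3 = c("x", "y", NA),
      v4 = c(TRUE, NA, FALSE)
    )

    # One logical vector per column (TRUE = not NA), summed across columns
    d[, num_obs := Reduce(`+`, lapply(.SD, function(x) !is.na(x)))]

    print(d$num_obs)  # 3 1 3
    ```

    Because the work is done column by column, no intermediate matrix has to be built, which is the likely reason this tends to be quicker than rowSums(!is.na(...)).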
    

    If speed is critical, you can eke out a touch more with Ananda's suggestion to hardcode the number of columns being assessed (4 in this example):

    d[, num_obs := 4 - Reduce("+", lapply(.SD, is.na))]
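
    The hardcoded 4 has to match the number of columns being checked, and running the line a second time would also count num_obs itself. A possible variation (my own sketch, not part of the original suggestion) is to name the columns once via .SDcols and derive the count from length(.SD). The table and the cols vector below are hypothetical, and I have not benchmarked whether this keeps the full speed advantage.

    ```r
    library(data.table)

    # Hypothetical example data, same shape as a 4-column input
    d <- data.table(
      v1 = c(1, NA, 3),
      v2 = c(NA, NA, 6),
      v3 = c("x", "y", NA),
      v4 = c(TRUE, NA, FALSE)
    )

    # Columns to inspect (assumed names); keeps num_obs out of the count
    cols <- c("v1", "v2", "v3", "v4")

    # Number of inspected columns minus the per-row NA count
    d[, num_obs := length(.SD) - Reduce(`+`, lapply(.SD, is.na)), .SDcols = cols]

    # Running it again gives the same answer, since .SDcols excludes num_obs
    d[, num_obs := length(.SD) - Reduce(`+`, lapply(.SD, is.na)), .SDcols = cols]

    print(d$num_obs)  # 3 1 3
    ```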
    

    Benchmarking using Ananda's larger data.table d from above:

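    # fun1: rowSums() over the logical matrix of non-NA cells
    fun1 <- function(indt) indt[, num_obs := rowSums(!is.na(indt))][]
    # fun3: add up per-column !is.na() vectors with Reduce
    fun3 <- function(indt) indt[, num_obs := Reduce(`+`, lapply(.SD,function(x) !is.na(x)))][]
    # fun4: hardcoded column count (4) minus the per-row NA count
    fun4 <- function(indt) indt[, num_obs := 4 - Reduce("+", lapply(.SD, is.na))][]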
    
    library(microbenchmark)
    microbenchmark(fun1(copy(d)), fun3(copy(d)), fun4(copy(d)), times=10L)
    
    #Unit: milliseconds
    #          expr      min       lq     mean   median       uq      max neval
    # fun1(copy(d)) 3.565866 3.639361 3.912554 3.703091 4.023724 4.596130    10
    # fun3(copy(d)) 2.543878 2.611745 2.973861 2.664550 3.657239 4.011475    10
    # fun4(copy(d)) 2.265786 2.293927 2.798597 2.345242 3.385437 4.128339    10
    
