Fastest way to replace NAs in a large data.table

走了就别回头了 2020-11-22 17:10

I have a large data.table, with many missing values scattered throughout its ~200k rows and 200 columns. I would like to recode those NA values to zeros as efficiently as possible.

10 answers
  •  天涯浪人
    2020-11-22 17:23

    To generalize to many columns you could use this approach (using previous sample data but adding a column):

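    library(data.table)
    # Sample data: two numeric columns with NAs scattered at random
    z = data.table(x = sample(c(NA_integer_, 1), 2e7, TRUE), y = sample(c(NA_integer_, 1), 2e7, TRUE))
    
    # Replace NA with 0 in every column, assigning by reference
    z[, names(z) := lapply(.SD, function(x) fifelse(is.na(x), 0, x))]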
    

    I haven't tested the speed, though.
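
Since the speed of the approach above is untested, here is a minimal sketch of how one might check it. It assumes data.table 1.12.4 or later (for `fifelse()` and `setnafill()`), uses a smaller row count than the sample data so it runs quickly, and compares against `setnafill()` as one possible alternative, not as a recommendation from the answer. The names `n`, `z1`, `z2`, `t1`, and `t2` are illustrative only.

```r
library(data.table)

set.seed(1)   # assumption: fixed seed, only so the sketch is reproducible
n <- 1e5      # assumption: far fewer rows than the 2e7 used above
z <- data.table(x = sample(c(NA_real_, 1), n, TRUE),
                y = sample(c(NA_real_, 1), n, TRUE))

# Work on copies so both approaches start from identical data
z1 <- copy(z)
z2 <- copy(z)

# Approach from the answer: rebuild every column with fifelse()
t1 <- system.time(
  z1[, names(z1) := lapply(.SD, function(x) fifelse(is.na(x), 0, x))]
)

# Alternative for comparison: setnafill() fills NAs in place
t2 <- system.time(setnafill(z2, type = "const", fill = 0))

# Both approaches should leave no NAs and produce the same table
stopifnot(!anyNA(z1), !anyNA(z2), isTRUE(all.equal(z1, z2)))

# Elapsed seconds for each approach (values depend on the machine)
print(c(fifelse = t1[["elapsed"]], setnafill = t2[["elapsed"]]))
```

Timings from a single `system.time()` call are noisy, so repeating the run on data closer to the real size (~200k rows, 200 columns) would give a fairer picture.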
