How to delete a row by reference in data.table?

南方客 2020-11-22 16:07

My question is related to assignment by reference versus copying in data.table. I want to know if one can delete rows by reference, similar to

         


        
6 Answers
  •  庸人自扰
    2020-11-22 16:45

    The approach I have taken to keep memory use close to that of in-place deletion is to subset one column at a time, deleting each source column as I go. It is not as fast as a proper C memmove solution, but memory use is all I care about here. Something like this:

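    library(data.table)

    DT = data.table(col1 = 1:1e6)
    cols = paste0('col', 2:100)
    for (col in cols){ DT[, (col) := 1:1e6] }  # build a 100-column test table
    keep.idxs = sort(sample(1e6, 9e5, FALSE))  # keep 90% of rows; sorted so row order is preserved
    DT.subset = data.table(col1 = DT[['col1']][keep.idxs])  # this is the subsetted table
    for (col in cols){
      DT.subset[, (col) := DT[[col]][keep.idxs]]  # copy the kept rows of one column
      DT[, (col) := NULL]  # delete that column from the original to release its memory
    }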
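
    The same idea can be wrapped in a function. This is only a sketch: `delete_rows_by_column` and its `del.idxs` argument are hypothetical names of my own, not part of data.table, and it assumes `del.idxs` holds the row numbers to drop.

```r
library(data.table)

# Hypothetical helper (not a data.table function): returns a new table without
# the rows in del.idxs, moving one column at a time and dropping each source
# column from DT by reference as soon as it has been copied.
delete_rows_by_column <- function(DT, del.idxs) {
  keep.idxs <- setdiff(seq_len(nrow(DT)), del.idxs)  # rows to keep, original order
  cols <- copy(names(DT))  # copy, because names(DT) changes as columns are dropped
  DT.subset <- data.table(DT[[cols[1]]][keep.idxs])  # start with the first column
  setnames(DT.subset, cols[1])
  for (col in cols[-1]) {
    DT.subset[, (col) := DT[[col]][keep.idxs]]  # copy the kept rows of one column
    DT[, (col) := NULL]                         # drop it from the original
  }
  DT.subset
}

DT <- data.table(a = 1:10, b = letters[1:10], c = (1:10) / 2)
DT <- delete_rows_by_column(DT, c(2L, 5L))  # rebind: the result is a new object
```

    Note that this is not true deletion by reference: the caller has to reassign the result, and the original table is left holding only its first column until it is overwritten or removed.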
    
