My question is related to assignment by reference versus copying in data.table
. I want to know if one can delete rows by reference, similar to
Good question. data.table
can't delete rows by reference yet.
data.table
can add and delete columns by reference since it over-allocates the vector of column pointers, as you know. The plan is to do something similar for rows and allow fast insert
and delete
. A row delete would use memmove
in C to budge up the items (in each and every column) after the deleted rows. Deleting a row in the middle of the table would still be quite inefficient compared to a row store database such as SQL, which is more suited for fast insert and delete of rows wherever those rows are in the table. But still, it would be a lot faster than copying a new large object without the deleted rows.
On the other hand, since column vectors would be over-allocated, rows could be inserted (and deleted) at the end, instantly; e.g., a growing time series.
It's filed as an issue: Delete rows by reference.