Subset data table without using <-

Submitted by 你离开我真会死。 on 2019-12-04 17:19:00

Question


I want to subset some rows of a data table. Like this:

# load data.table and the data
library(data.table)
data("mtcars")

# convert to data.table by reference, keeping row names in column rn
setDT(mtcars, keep.rownames = TRUE)

# subset rows
mtcars <- mtcars[like(rn, "Mer"), ] # or
mtcars <- mtcars[mpg > 20, ]

However, I'm working with a huge data set and want to avoid using <-, which I understand is not memory efficient because it makes a copy of the data.

Is this correct? Is it possible to update the filtered data without <- ?


Answer 1:


What you are asking for is deleting rows by reference.

It is not yet possible, but there is a feature request for it: #635.

Until then you need to copy (in memory) the subset of your data.table. The copy is made when <- (or =) is combined with a subset (the i argument), so for now you cannot avoid it.
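
To see where the copy happens, here is a minimal sketch, assuming the data.table package is installed. It uses data.table's address() to compare memory locations before and after each operation. The names dt and kpl are illustrative and do not come from the question.

```r
library(data.table)

# Work on a data.table copy of mtcars so the built-in data set is untouched.
dt <- as.data.table(mtcars, keep.rownames = TRUE)
addr_before <- address(dt)

# Adding a column with := modifies dt in place, so the address is unchanged.
dt[, kpl := mpg * 0.425]
same_after_set <- identical(address(dt), addr_before)

# Subsetting rows builds a new data.table; <- then rebinds the name to it.
dt <- dt[mpg > 20]
same_after_subset <- identical(address(dt), addr_before)

print(same_after_set)
print(same_after_subset)
```

The old table is only released by the garbage collector once nothing refers to it, so both the full table and the subset exist in memory for a moment.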

If it helps, you can operate on language objects to predefine the operation and delay its evaluation, and also reuse the predefined objects multiple times:

mtcars_sub <- quote(mtcars[like(rn,"Mer")])
mtcars_sub2 <- quote(eval(mtcars_sub)[mpg > 20])
eval(mtcars_sub2)
#           rn  mpg cyl  disp hp drat   wt qsec vs am gear carb
# 1: Merc 240D 24.4   4 146.7 62 3.69 3.19 20.0  1  0    4    2
# 2:  Merc 230 22.8   4 140.8 95 3.92 3.15 22.9  1  0    4    2

BTW, when subsetting a data.table you don't need the trailing comma, as in dt[x==1,]; you can just write dt[x==1].
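
A small sketch of that equivalence, assuming the data.table package is installed. The table dt and its columns x and y are made up for illustration.

```r
library(data.table)

# Hypothetical toy table, used only to compare the two spellings.
dt <- data.table(x = c(1, 2, 1), y = c("a", "b", "c"))

with_comma    <- dt[x == 1, ]
without_comma <- dt[x == 1]

# all.equal() has a data.table method; TRUE means both forms select the same rows.
same_result <- isTRUE(all.equal(with_comma, without_comma))
print(same_result)
```

Note that this differs from a plain data.frame, where df[df$x == 1] without the comma would try to select columns rather than rows.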



Source: https://stackoverflow.com/questions/32882768/subset-data-table-without-using
