Assign value to specific data.table columns and rows

Posted by 不想你离开。 on 2019-12-02 18:16:24
Arun

First, it is recommended to use := instead of [<- for efficiency. The [<- method is mostly provided for backward compatibility. So, I'll first illustrate how to use := efficiently to get what you're after. := assigns by reference: it updates a data.table in place without copying the data, which makes it extremely fast.

require(data.table)
DT <- data.table(x = 1:5, y = 6:10, z = 11:15)

Suppose you want to set the 2nd row of "y" to the value in the 5th row of "y":

DT[2, y := DT[5, y]] 

or equivalently

DT[2, `:=`(y = DT[5, y])]

Suppose you want to set the 2nd row of both "y" and "z" to the corresponding entries in row 5:

DT[2, c("y", "z") := as.list(DT[5, c(y, z)])]

or equivalently

DT[2, `:=`(y = DT[5, y], z = DT[5, z])]
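
When the columns to update are held in a character vector, the same update can be written generically. The following is a minimal sketch, assuming a hypothetical variable `cols` (not part of the original question) that holds the column names. Wrapping it in parentheses on the left of := tells data.table to use its value rather than create a column literally named "cols", and with = FALSE on the right selects those columns by name.

```r
library(data.table)

DT <- data.table(x = 1:5, y = 6:10, z = 11:15)

# 'cols' is a hypothetical helper variable introduced for illustration
cols <- c("y", "z")

# Copy row 5 into row 2 for every column named in 'cols', by reference.
# The right-hand side is a one-row data.table, i.e. a list with one
# element per column, which is a shape := accepts.
DT[2, (cols) := DT[5, cols, with = FALSE]]

print(DT)
```

This avoids repeating each column name on both sides, which may be convenient when many columns are involved.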

Now, just to show how to assign using [<- (although it is clearly not recommended), it can be done as follows:

DT <- data.table(x = 1:5, y = 6:10, z = 11:15)
DT[1, c("y", "z")] <- as.list(DT[5, c(y, z)])

or equivalently, you can pass the column numbers:

DT[1, 2:3] <- as.list(DT[5, c(y, z)])

Hope this helps.


Edit 1

As to why you get the error:

First, the RHS has to be a list for `[<-.data.table` if more than one column is being assigned to.

Second, the j argument on the left of <- is not evaluated within the environment of your data.table, so R has to find the values for j elsewhere. Because you supply var1 and var2 without the double quotes that would make them a character vector, they are treated as variables. Since the columns of your data.table are not visible as variables here (unlike in j on the right-hand side of <-), R looks for var1 and var2 in the calling environment, here the global environment, does not find them, and you get the error. For example, do this:

y <- "y"
z <- "z"
# And now try your second case: 
DT[2, c(y, z)] <- as.list(DT[5, c(y, z)])
# the left side takes values from the assignments you made above
# the right side y and z are evaluated within the environment of your data.table
# and so it sees the columns y and z as variables and their values are picked accordingly
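
The failing case itself can be reproduced in isolation. This is a small sketch, assuming a hypothetical table `DT2` with columns named var1 and var2 as in the question (its values are made up for illustration); the error is captured with tryCatch so the script keeps running.

```r
library(data.table)

# Hypothetical table standing in for the one in the question
DT2 <- data.table(var1 = 1:5, var2 = 6:10)

# Unquoted names in j on the left of <- are looked up as ordinary
# variables; no variable var1 exists outside the table, so this fails.
err <- tryCatch({
  DT2[2, c(var1, var2)] <- as.list(DT2[5, c(var1, var2)])
  NULL
}, error = function(e) conditionMessage(e))
print(err)

# Quoting the names makes j a character vector, and the assignment works.
DT2[2, c("var1", "var2")] <- as.list(DT2[5, c(var1, var2)])
print(DT2)
```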

Third, the `[<-.data.table` function accepts only atomic (vector) types for the j argument. So your first assignment, DT[2, list(var1, var2)] <- DT[8, list(var1, var2)], will still give an error even if you supply the names correctly, that is:

y <- "y"
z <- "z"
DT[2, list(y, z)] <- as.list(DT[5, c(y, z)])

# Error in `[<-.data.table`(`*tmp*`, 2, list(y, z), value = list(10L, 15L)) : 
#   j must be atomic vector, see ?is.atomic
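
As the error message suggests, is.atomic() shows which forms of j qualify. A quick check in base R (the variable names here are just for illustration):

```r
# Character and integer vectors are atomic, so they are accepted as j
chr_ok <- is.atomic(c("y", "z"))
int_ok <- is.atomic(2:3)

# A list is not atomic, which is why list(y, z) is rejected as j
lst_ok <- is.atomic(list("y", "z"))

print(c(chr_ok, int_ok, lst_ok))
```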

Hope this helps.


Edit 2

Just to illustrate that a copy of your data.table is made when you use [<- but not when you use :=:

DT <- data.table(x = 1:5, y = 6:10, z = 11:15)
tracemem(DT)
# [1] "<0x7fbefb89b580>"

DT[1, c("y", "z") := list(100L, 110L)]
tracemem(DT)
# [1] "<0x7fbefb89b580>"

DT[2, c("y", "z")] <- list(200L, 201L)
# tracemem[0x7fbefacc4fa0 -> 0x7fbefd297838]: # copied, inefficient
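
Since printed memory addresses differ between sessions, the by-reference behaviour of := can also be checked programmatically. This is a minimal sketch using data.table's address() function; the variable names `before` and `after` are introduced here for illustration only.

```r
library(data.table)

DT <- data.table(x = 1:5, y = 6:10, z = 11:15)

# Record the address of the table, update by reference, and compare
before <- address(DT)
DT[1, c("y", "z") := list(100L, 110L)]
after <- address(DT)

# TRUE means the table was modified in place, with no copy made
print(identical(before, after))
```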