:= (pass by reference) operator in the data.table package modifies another data table object simultaneously

久未见 提交于 2019-12-28 03:01:06

问题


While testing my code, I found out the following: If I assign a data.table DT1 to DT and change DT afterwards, DT1 changes with it. So DT and DT1 seem to be internally linked. Is this intended behavior? Although I'm not a programming expert, this looks wrong to me, and testing it with simple R variables or a data.frame, I couldn't reproduce the behavior. What's happening here?

DF <- data.frame(ID=letters[1:5],
                  value=1:5)
DF1 <- DF
all.equal(DF1, DF)
[1] TRUE
DF[1, "value"] <- DF[1, "value"]*2
all.equal(DF1, DF)
[1] "Component 2: Mean relative difference: 1"

library(data.table)
data.table 1.7.1  For help type: help("data.table")
DT <- data.table(ID=letters[1:5],
                  value=1:5)
DT1 <- DT
all.equal(DT1, DT)
[1] TRUE
DT[, value:=value*2]
     ID value
[1,]  a     2
[2,]  b     4
[3,]  c     6
[4,]  d     8
[5,]  e    10
all.equal(DT1, DT)
[1] TRUE

回答1:


This piece of documentation in data.table would help. ? data.table::copy

No value is returned. The data.table is modified by reference. If you require a copy, take a copy first (using DT2=copy(DT)). copy() may also sometimes be useful before := is used to subassign to a column by reference.



来源:https://stackoverflow.com/questions/8030452/pass-by-reference-operator-in-the-data-table-package-modifies-another-data

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!