I am in the process of creating a package that uses a data.table as a dataset and has a couple of functions which assign by reference using :=.
Another solution is to use inst/extdata to save the rda file (which would contain any number of data.table objects) and have a file DT.r within the data subdirectory
# get the environment from the call to `data()`
env <- get('envir', parent.frame(1))
# load the data
load(system.file('extdata','DT.rda', package= 'foo'), envir = env)
# overallocate (evaluating in correct environment)
if(require(data.table)){
# the contents of `DT.rda` are known, so write out in full
evalq(alloc.col(DT), envir = env)
}
# clean up so `env` object not present in env environment after calling `data(DT)`
rm(list = c('env'), envir = env)
}
This has nothing to do with datasets or locking -- you can reproduce it simply using
DT<-unserialize(serialize(data.table(b = 1:5),NULL))
foo(DT)
DT
I suspect it has to do with the fact that data.table has to re-create the extptr inside the object on the first access on DT, but it's doing so on a copy so there is no way it can share the modification with the original in the global environment.
[From Matthew] Exactly.
DT<-unserialize(serialize(data.table(b = 1:3),NULL))
DT
b
1: 1
2: 2
3: 3
DT[,newcol:=42]
DT # Ok. DT rebound to new shallow copy (when direct)
b newcol
1: 1 42
2: 2 42
3: 3 42
DT<-unserialize(serialize(data.table(b = 1:3),NULL))
foo(DT)
b a
1: 1 1
2: 2 1
3: 3 1
DT # but not ok when via function foo()
b
1: 1
2: 2
3: 3
DT<-unserialize(serialize(data.table(b = 1:3),NULL))
alloc.col(DT) # alloc.col needed first
b
1: 1
2: 2
3: 3
foo(DT)
b a
1: 1 1
2: 2 1
3: 3 1
DT # now it's ok
b a
1: 1 1
2: 2 1
3: 3 1
Or, don't pass DT into the function, just refer to it directly. Use data.table like a database: a few fixed name tables in .GlobalEnv.
DT <- unserialize(serialize(data.table(b = 1:5),NULL))
foo <- function() {
DT[, newcol := 7]
}
foo()
b newcol
1: 1 7
2: 2 7
3: 3 7
4: 4 7
5: 5 7
DT # Unserialized data.table now over-allocated and updated ok.
b newcol
1: 1 7
2: 2 7
3: 3 7
4: 4 7
5: 5 7