readRDS() loads extra packages

我只是一个虾纸丫 提交于 2019-12-22 04:21:49

问题


Under what circumstances does the readRDS() function in R try to load packages/namespaces? I was surprised to see the following in a fresh R session:

> loadedNamespaces()
[1] "base"      "datasets"  "graphics"  "grDevices" "methods"   "stats"    
[7] "tools"     "utils"    
> x <- readRDS('../../../../data/models/my_model.rds')
There were 19 warnings (use warnings() to see them)
> loadedNamespaces()
 [1] "base"         "class"        "colorspace"   "data.table"  
 [5] "datasets"     "dichromat"    "e1071"        "earth"       
 [9] "evaluate"     "fields"       "formatR"      "gbm"         
[13] "ggthemes"     "graphics"     "grDevices"    "grid"        
[17] "Iso"          "knitr"        "labeling"     "lattice"     
[21] "lubridate"    "MASS"         "methods"      "munsell"     
[25] "plotmo"       "plyr"         "proto"        "quantreg"    
[29] "randomForest" "RColorBrewer" "reshape2"     "rJava"       
[33] "scales"       "spam"         "SparseM"      "splines"     
[37] "stats"        "stringr"      "survival"     "tools"       
[41] "utils"        "wra"          "wra.ops"      "xlsx"        
[45] "xlsxjars"     "xts"          "zoo"     

If any of those new packages aren't available, the readRDS() fails.

The 19 warnings mentioned are:

> warnings()
Warning messages:
1: replacing previous import ‘hour’ when loading ‘data.table’
2: replacing previous import ‘last’ when loading ‘data.table’
3: replacing previous import ‘mday’ when loading ‘data.table’
4: replacing previous import ‘month’ when loading ‘data.table’
5: replacing previous import ‘quarter’ when loading ‘data.table’
6: replacing previous import ‘wday’ when loading ‘data.table’
7: replacing previous import ‘week’ when loading ‘data.table’
8: replacing previous import ‘yday’ when loading ‘data.table’
9: replacing previous import ‘year’ when loading ‘data.table’
10: replacing previous import ‘here’ when loading ‘plyr’
11: replacing previous import ‘hour’ when loading ‘data.table’
12: replacing previous import ‘last’ when loading ‘data.table’
13: replacing previous import ‘mday’ when loading ‘data.table’
14: replacing previous import ‘month’ when loading ‘data.table’
15: replacing previous import ‘quarter’ when loading ‘data.table’
16: replacing previous import ‘wday’ when loading ‘data.table’
17: replacing previous import ‘week’ when loading ‘data.table’
18: replacing previous import ‘yday’ when loading ‘data.table’
19: replacing previous import ‘year’ when loading ‘data.table’

So apparently it's loading something like lubridate and then data.table, generating namespace conflicts as it goes.

FWIW, unserialize() gives the same results.

What I really want is to load these objects without also loading everything the person who saved them seemed to have loaded at the time, which is what it sort of looks like it's doing.

Update: here are the classes in the object x:

> classes <- function(x) {
    cl <- c()
    for(i in x) {
      cl <- c(cl, if(is.list(i)) c(class(i), classes(i)) else class(i))
    }
    cl
  }
> unique(classes(x))
 [1] "list"              "numeric"           "rq"               
 [4] "terms"             "formula"           "call"             
 [7] "character"         "smooth.spline"     "integer"          
[10] "smooth.spline.fit"

qr is from the quantreg package, all the rest are from base or stats.


回答1:


Ok. This may not be a useful answer (which would need more details) but I think it is at least an aswer to the "under what circumstances.." part.

First of all, I think it is not specific to readRDS but works the same way with any save'd objects that can be load'ed.

The "under what circumstances" part: when the saved object contains an environment having a package/namespace environment as a parent. Or when it contains a function whose environment is a package/namespace environment.

require(Matrix)
foo <- list(
   a = 1,
   b = new.env(parent=environment(Matrix)),
   c = "c")
save(foo, file="foo.rda")
loadedNamespaces()   # Matrix is there!
detach("package:Matrix")
unloadNamespace("Matrix")
loadedNamespaces()   # no Matrix there!
load("foo.rda")
loadedNamespaces()   # Matrix is back again

And the following works too:

require(Matrix)
bar <- list(
   a = 1,
   b = force,
   c = "c")
environment(bar$b) <- environment(Matrix)
save(bar, file="bar.rda")
loadedNamespaces()      # Matrix is there!
detach("package:Matrix")
unloadNamespace("Matrix")
loadedNamespaces()      # no Matrix there!
load("bar.rda")
loadedNamespaces()      # Matrix is back!

I haven't tried but there's no reason why it shouldn't work the same way with saveRDS/readRDS. And the solution: if that does no harm to the saved objects (i.e., if you're sure that the environments are actually not needed), you can remove the parent environments by replacing them e.g. by setting the parent.env to something that makes sense. So using the foo above,

parent.env(foo$b) <- baseenv()
save(foo, file="foo.rda")
loadedNamespaces()        # Matrix is there ....
unloadNamespace("Matrix")
loadedNamespaces()        # no Matrix there ...
load("foo.rda")
loadedNamespaces()        # still no Matrix ...



回答2:


One painful workaround I've come up with is to cleanse the object of any environments it had attached to it, by a nasty eval:

sanitizeEnvironments <- function(obj) {
    tc <- textConnection(NULL, 'w')
    dput(obj, tc)
    source(textConnection(textConnectionValue(tc)))$value
}

I can take the old object, run it through this function, then do saveRDS() on it again. Then loading the new object doesn't blow chunks all over my namespace.



来源:https://stackoverflow.com/questions/19146684/readrds-loads-extra-packages

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!