Exceeded maximum number of DLLs in R

Submitted by 喜夏-厌秋 on 2019-12-08 15:28:40

Question


I am using RStan to sample from a large number of Gaussian Processes (GPs), i.e., using the function stan(). For every GP that I fit, another DLL gets loaded, as can be seen by running the R command

getLoadedDLLs()

The problem I'm running into is that, because I need to fit so many unique GPs, I'm exceeding the maximum number of DLLs that can be loaded, at which point I receive the following error:

Error in dyn.load(libLFile) : 
unable to load shared object '/var/folders/8x/n7pqd49j4ybfhrm999z3cwp81814xh/T//RtmpmXCRCy/file80d1219ef10d.so':
maximal number of DLLs reached...

As far as I can tell, this is set in Rdynload.c of the base R code, as follows:

#define MAX_NUM_DLLS 100
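To see how close a session is to that limit, the list returned by getLoadedDLLs() can simply be counted. The following is a minimal sketch in base R; the value 100 is an assumption copied from the #define above and may not match every R build, and the variable names are illustrative only.

```r
# Assumption: the compiled-in limit quoted above (MAX_NUM_DLLS)
dll_limit <- 100L

# getLoadedDLLs() returns a list with one entry per loaded DLL
n_loaded <- length(getLoadedDLLs())
remaining <- dll_limit - n_loaded

cat("Loaded DLLs:", n_loaded, "/ remaining slots:", remaining, "\n")
```

Running this before and after a call to stan() should show the count growing by one for each newly compiled model.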

So, my question is, what can be done to fix this? Building R from source with a larger MAX_NUM_DLLS isn't an option, as my code will be run by collaborators who wouldn't be comfortable with that process. I've tried the naive approach of just unloading DLLs using dyn.unload() in the hopes that they'd just be reloaded when they're needed again. The unloading works fine, but when I try to use the fit again, R fairly unsurprisingly crashes with an error like:

*** caught segfault ***
address 0x121366da8, cause 'memory not mapped'

I've also tried detaching RStan in the hopes that the DLLs would be automatically unloaded, but they persist even after unloading the package (as expected, given the following in the help for detach: "detaching will not in general unload any dynamically loaded compiled code (DLLs)").

From this question, Can Rcpp package DLLs be unloaded without restarting R?, it seems that library.dynam.unload() might have some role in the solution, but I haven't had any success using it to unload the DLLs, and I suspect that after unloading the DLL I'd run into the same segfault as before.

EDIT: adding a minimal, fully-functional example:

The R code:

require(rstan)

x <- c(1,2)
N <- length(x)

fits <- list()
for(i in 1:100)
{
    fits[[i]] <- stan(file="gp-sim.stan", data=list(x=x,N=N), iter=1, chains=1)
}

This code requires that the following model definition be in the working directory in a file gp-sim.stan (this model is one of the examples included with Stan):

// Sample from Gaussian process
// Fixed covar function: eta_sq=1, rho_sq=1, sigma_sq=0.1

data {
  int<lower=1> N;
  real x[N];
}
transformed data {
  vector[N] mu;
  cov_matrix[N] Sigma;
  for (i in 1:N)
    mu[i] <- 0;
  for (i in 1:N)
    for (j in 1:N)
      Sigma[i,j] <- exp(-pow(x[i] - x[j],2)) + if_else(i==j, 0.1, 0.0);
}
parameters {
  vector[N] y;
}
model {
  y ~ multi_normal(mu,Sigma);
}

Note: this code takes quite some time to run, as it is creating ~100 Stan models.


Answer 1:


I can't speak to the issues regarding DLLs, but you shouldn't need to compile the model each time. You can compile the model once and reuse it, which avoids this problem and also speeds up your code.

The function stan is a wrapper around stan_model, which compiles the model, and the sampling method, which draws samples from it. Run stan_model once to compile the model and save the result to an object, then call the sampling method on that object to draw samples.

require(rstan)

x <- c(1,2)
N <- length(x)

fits <- list()
mod <- stan_model("gp-sim.stan")
for(i in 1:100)
{
    fits[[i]] <- sampling(mod, data=list(x=x,N=N), iter=1, chains=1)
}

This is similar to the problem of running parallel chains, discussed in the RStan wiki. Your code could be sped up by replacing the for loop with something that runs the sampling in parallel.
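One way to parallelize the loop is mclapply from the parallel package, sketched below. The helper fit_parallel and its sampler and cores arguments are hypothetical names for this illustration, not part of rstan; the sampler defaults to rstan::sampling, and the data sets in the usage part are made up. Treat it as a starting point rather than a tuned implementation.

```r
# Hypothetical helper: draw samples for several data sets in parallel,
# reusing a single compiled model so that only one DLL is loaded.
fit_parallel <- function(model, data_sets, sampler = rstan::sampling,
                         cores = 2L, ...) {
  # mclapply relies on forking, which is unavailable on Windows,
  # so fall back to a single core there
  if (.Platform$OS.type == "windows") cores <- 1L
  parallel::mclapply(
    data_sets,
    function(d) sampler(model, data = d, ...),
    mc.cores = cores
  )
}

# Usage: only runs when rstan and the model file are available
if (requireNamespace("rstan", quietly = TRUE) && file.exists("gp-sim.stan")) {
  mod <- rstan::stan_model("gp-sim.stan")  # compiled once
  data_sets <- list(list(x = c(1, 2), N = 2), list(x = c(1, 3), N = 2))
  fits <- fit_parallel(mod, data_sets, iter = 1, chains = 1)
}
```

Passing the sampler as an argument keeps the helper easy to try out with a stand-in function before running real Stan fits.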




Answer 2:


Here is what I use to run several Stan models in a row (Win10, R 3.3.0).

I needed not only to unload the DLL files but also to delete them, along with other temporary files. Also, the filename in my case was different from the one found in the stan object, as Ben suggested.

# Shared objects compiled into this session's temporary directory
dso_filenames <- dir(tempdir(), pattern = paste0("\\", .Platform$dynlib.ext, "$"))
filenames <- dir(tempdir())
# Unload each compiled model; fits that depend on them must not be used afterwards
for (f in dso_filenames)
  dyn.unload(file.path(tempdir(), f))
# Delete the temporary files, skipping long filenames that could not be removed
for (f in filenames)
  if (file.exists(file.path(tempdir(), f)) && nchar(f) < 42)
    file.remove(file.path(tempdir(), f))
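A variation on the same idea is to ask R which DLLs it has actually loaded from the temporary directory, instead of listing files on disk. The sketch below uses only base R; temp_dlls and unload_temp_dlls are hypothetical helper names, and the path comparison is an assumption that may need adjusting on some platforms.

```r
# Hypothetical helper: paths of loaded DLLs that live under tempdir()
temp_dlls <- function() {
  # Each entry of getLoadedDLLs() stores the file it was loaded from in "path"
  paths <- vapply(getLoadedDLLs(), function(d) d[["path"]], character(1))
  norm <- function(p) normalizePath(p, winslash = "/", mustWork = FALSE)
  # Compare normalized paths, but return the strings R itself recorded
  in_temp <- startsWith(norm(paths), norm(tempdir()))
  unname(paths[in_temp])
}

# Hypothetical helper: unload those DLLs. Call it only after discarding
# every fit that depends on them, otherwise R may crash with a segfault.
unload_temp_dlls <- function() {
  for (p in temp_dlls()) dyn.unload(p)
  invisible(NULL)
}

print(temp_dlls())
```

Because the list comes from getLoadedDLLs(), dyn.unload() is only called on objects that are really loaded.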


Source: https://stackoverflow.com/questions/24832030/exceeded-maximum-number-of-dlls-in-r
