Question
I am looking for a simple "set it and forget it" way, either as a single string of arguments in the terminal or as a simple Java program, to automate the following:
1) start an R session
2) tell R to source .R files that contain code for lengthy, parallelized simulations
3) terminate the R session upon completion
4) start a new R session
5) tell R to source other .R files
6) terminate the R session upon completion
7) lather, rinse, repeat
My .R scripts will take a total of several days to run, during which I will be out of town and unable to check on them. If I run them all in the same session, there is no way for me to avoid maxing out my available RAM.
Thanks!
EDIT: I am running R 2.15.3 on Ubuntu 12.04 LTS, with 16GB RAM
Answer 1:
Starting and terminating R sessions is handled by Rscript. So write your scripts and call them like this:
Rscript script_1.R
Rscript script_2.R
...
Rscript script_Inf.R
That leaves points 2 and 5... which are simply a matter of putting:
source('/home/sc_evans/script_abc.R')
...at the head of whatever script(s) need it.
Each script will get its own R session that will terminate upon completion. Put those commands in a batch script and run it.
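For example, a minimal batch script could look like the sketch below (run_all.sh and the script names are just placeholders for your own files):
#!/bin/bash
# Run each simulation in its own R session, one after another.
# Each Rscript call starts a fresh session and exits when that script finishes,
# so memory from one run is released before the next begins.
Rscript script_1.R
Rscript script_2.R
Rscript script_3.R
Since you'll be away while it runs, you'll probably want to start it in a way that survives logging out, e.g. make it executable with chmod +x run_all.sh and launch it with:
nohup ./run_all.sh > run_all.log 2>&1 &
(or run it inside screen/tmux).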
EDIT
If I were to do this myself, though, I would forget about using separate scripts. As long as you're managing memory properly, running a single process should work out fine. Divide your work into appropriate functions:
massive_process_1 <- function() {
  x <- do_something()  # your first lengthy simulation
  saveRDS(x, '/home/sc_evans/results/first_result.rds')
}

massive_process_2 <- function() {
  x <- do_something()  # your second lengthy simulation
  saveRDS(x, '/home/sc_evans/results/second_result.rds')
}

# Run them in sequence; x is local to each function, so its memory
# can be reclaimed as soon as the function returns.
massive_process_1()
massive_process_2()
And so on. The next function won't run until the previous one has completed, and the large objects exist only inside the functions, so you shouldn't run out of memory.
Answer 2:
You may look at JRI (the Java/R Interface), which lets you run R from within a Java application: http://www.rforge.net/JRI
Source: https://stackoverflow.com/questions/15837617/writing-code-to-start-an-r-session-run-r-script-terminate-session-repeat