Make R package easy to update with new files from users

Posted by 孤者浪人 on 2019-12-13 21:25:01

Question


First let me explain that I come from the Python world, where I can do what I want like this in the shell:

$ export PYTHONPATH=~/myroot
$ mkdir -p ~/myroot/mypkg
$ touch ~/myroot/mypkg/__init__.py # this is the one bit of "magic" for Python
$ echo 'hello = "world"' > ~/myroot/mypkg/mymodule.py

Then in Python:

>>> import mypkg.mymodule
>>> mypkg.mymodule.hello
'world'

What I did there was to create a package which is easily extended by other users. I can check in ~/myroot/mypkg to source control and other users can later add modules to it using just a text editor. Now I want to do the equivalent thing in R. Here's what I have so far:

$ export R_LIBS=~/myR # already this is bad: it makes install.packages() put things here!
$ mkdir -p ~/myR
$ echo 'hello = "world"' > /tmp/mycode.R

Now in R:

> package.skeleton(name="mypkg", code_files="/tmp/mycode.R")

Now back to the shell:

$ R CMD build mypkg
$ R CMD INSTALL mypkg

Now back to R:

> library(mypkg)
> hello
[1] "world"

So that works. But how do my colleagues add new modules to this package? I want them to be able to just add a new R file, but it seems we would then have to redo the entire package build, which is tedious, and we would end up checking in many generated files. Most importantly, where did the code go? After R CMD INSTALL, R knew what my code was, but it did not put the literal text of mycode.R anywhere under $R_LIBS. (Did it "compile" the code? I'm not sure.)
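
One way to look into the "where did the code go" point is to list what an installed package keeps in its R/ directory. The sketch below uses the stats package only because it ships with every R installation; that choice, and the variable names, are assumptions for illustration. As far as I understand it, R CMD INSTALL by default stores the parsed objects in a lazy-load database (.rdb/.rdx files) rather than copying the .R sources.

```r
# List the files an installed package keeps in its R/ directory.
# "stats" is a stand-in; substitute "mypkg" after R CMD INSTALL.
pkg <- "stats"
r_dir <- system.file("R", package = pkg)  # path to <library>/<pkg>/R
installed_files <- list.files(r_dir)
print(installed_files)
```

If the listing shows .rdb and .rdx files and no .R files, the code was stored in parsed form, which would explain why the original text is nowhere to be found under $R_LIBS.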

Previously we would just source() our "modules" but this is not very good because it reloads the code every time, so indirect (transitive) dependencies end up reloading stuff that is already loaded.
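
To make the reloading problem concrete, here is a minimal sketch. The file names a.r, b.r and util.r and the counter util_load_count are hypothetical, and the snippet assumes a fresh R session. Two "modules" each source() the same helper, so the helper is evaluated twice.

```r
# Sketch: count how often a shared helper is evaluated when two
# "modules" each source() it. All names here are hypothetical.
mod_dir <- tempfile("modules")
dir.create(mod_dir)
util_path <- file.path(mod_dir, "util.r")

# util.r bumps a global counter every time it is evaluated.
writeLines(c(
  "if (!exists('util_load_count')) util_load_count <- 0",
  "util_load_count <- util_load_count + 1"
), util_path)

# a.r and b.r both depend on util.r; deparse() quotes the path safely.
source_line <- sprintf("source(%s)", deparse(util_path))
writeLines(source_line, file.path(mod_dir, "a.r"))
writeLines(source_line, file.path(mod_dir, "b.r"))

source(file.path(mod_dir, "a.r"))
source(file.path(mod_dir, "b.r"))

print(util_load_count)  # 2 in a fresh session: util.r ran once per dependent
```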

My question is, how do people manage simple, in-house, collaboratively edited, source-controlled, non-binary, non-compiled, shared code in R?

I'm using R 3.1.1 on Linux. If the solution works on Windows too that would be nice.


Answer 1:


As far as I can tell, R has nothing like Python's import statement, so I wrote my own. Put this in a file such as import.r and source it from your $R_PROFILE file.

# this is sort of like Python's import statement, and lets us avoid redundant sourcing

.imports <- c("import") # module names imported so far, to avoid redundant imports (never import ourselves)

.importScriptPath <- function() {
  # returns the path of the executing script
  # see http://stackoverflow.com/questions/1815606

  # this will only work if the caller was loaded with source()
  filePath <- sys.frame(2)$ofile

  # if the caller was not loaded with source(), use the main script path
  if (length(filePath) == 0) {
    argv <- commandArgs(trailingOnly = FALSE)
    filePath <- substring(argv[grep("--file=", argv)], 8)
  }

  return (dirname(filePath))
}

import <- function(module) {
  # locates the given module (character or token), calls source() on it, and does nothing on subsequent calls

  module <- as.character(substitute(module)) # support import(foo) not only import("foo")

  if (module %in% .imports) {
    return(invisible())
  }

  moduleFilename <- paste0(gsub("\\.", "/", module), ".r") # allow import(foo.bar) as import("foo/bar")
  importPaths <- c(.importScriptPath()) # add more search paths here as desired

  for (importPath in importPaths) {
    modulePath <- file.path(importPath, moduleFilename)
    if (file.exists(modulePath)) {
      source(modulePath)
      .imports <<- append(.imports, module) # <<- updates the global variable so we skip it next time
      return(invisible())
    }
  }

  # last chance: try to load the module as an installed package
  suppressPackageStartupMessages(library(module, character.only = TRUE))
  .imports <<- append(.imports, module)
  return(invisible())
}
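
For readers who want to try the idea in isolation, here is a stripped-down sketch of the same "source once" technique. It is not the code above: import_once() and .loaded are hypothetical names, the search path is passed explicitly instead of being derived from the calling script, and there is no fallback to library().

```r
# Minimal sketch of "source a module at most once".
# import_once() and .loaded are hypothetical, not part of base R.
.loaded <- new.env()  # registry: module name -> file it was loaded from

import_once <- function(module, paths) {
  if (exists(module, envir = .loaded, inherits = FALSE)) {
    return(invisible(FALSE))  # already loaded: do nothing
  }
  # "foo.bar" maps to "foo/bar.r", as in the answer's import()
  rel <- paste0(gsub(".", "/", module, fixed = TRUE), ".r")
  for (p in paths) {
    f <- file.path(p, rel)
    if (file.exists(f)) {
      source(f)
      assign(module, f, envir = .loaded)
      return(invisible(TRUE))
    }
  }
  stop("module not found: ", module)
}

# Demo with a throwaway module in a temporary directory.
root <- tempfile("myroot")
dir.create(file.path(root, "mypkg"), recursive = TRUE)
writeLines('hello <- "world"', file.path(root, "mypkg", "mymodule.r"))

first  <- import_once("mypkg.mymodule", paths = root)  # sources the file
second <- import_once("mypkg.mymodule", paths = root)  # skipped
print(hello)
```

Keeping the registry in an environment rather than a character vector avoids the need for `<<-`, at the cost of one extra object.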


Source: https://stackoverflow.com/questions/25050275/make-r-package-easy-to-update-with-new-files-from-users