R writing style - require vs. ::

白昼怎懂夜的黑 提交于 2019-12-18 12:49:46

问题


OK, we're all familiar with double colon operator in R. Whenever I'm about to write some function, I use require(<pkgname>), but I was always thinking about using :: instead. Using require in custom functions is better practice than library, since require returns warning and FALSE, unlike library, which returns error if you provide a name of non-existent package.

On the other hand, :: operator gets the variable from the package, while require loads whole package (at least I hope so), so speed differences came first to my mind. :: must be faster than require.

And I did some analysis in order to check that - I've written two simple functions that load read.systat function from foreign package, with require and :: respectively, hence import Iris.syd dataset that ships with foreign package, replicated functions 1000 times each (which was shamelessly arbitrary), and... crunched some numbers.

Strangely (or not) I found significant differences in terms of user CPU and elapsed time, while there were no significant differences in terms of system CPU. And yet more strange conclusion: :: is actually slower! Documentation for :: is very blunt, and just by looking at sources it's obvious that :: should perform better!

require

#!/usr/local/bin/r

## with require
fn1 <- function() {
  require(foreign)
  read.systat("Iris.syd", to.data.frame=TRUE)
}

## times
n <- 1e3

sink("require.txt")
print(t(replicate(n, system.time(fn1()))))
sink()

double colon

#!/usr/local/bin/r

## with ::
fn2 <- function() {
  foreign::read.systat("Iris.syd", to.data.frame=TRUE)
}

## times
n <- 1e3


sink("double_colon.txt")
print(t(replicate(n, system.time(fn2()))))
sink()

Grab CSV data here. Some stats:

user CPU:     W = 475366    p-value = 0.04738  MRr =  975.866    MRc = 1025.134
system CPU:   W = 503312.5  p-value = 0.7305   MRr = 1003.8125   MRc =  997.1875
elapsed time: W = 403299.5  p-value < 2.2e-16  MRr =  903.7995   MRc = 1097.2005

MRr is mean rank for require, MRc ibid for ::. I must have done something wrong here. It just doesn't make any sense... Execution time for :: seems way faster!!! I may have screwed something up, you shouldn't discard that option...

OK... I've wasted my time in order to see that there is some difference, and I carried out completely useless analysis, so, back to the question:

"Why should one prefer require over :: when writing a function?"

=)


回答1:


"Why should one prefer require over :: when writing a function?"

I usually prefer require due to the nice TRUE/FALSE return value that lets me deal with the possibility of the package not being available up front before getting into the code. Crash as early as possible instead of halfway through your analysis.

I only use :: when I need to make sure I am using the correct version of a function, not a version from some other package that is masking the name.

On the other hand, :: operator gets the variable from the package, while require loads whole package (at least I hope so), so speed differences came first to my mind. :: must be faster than require.

I think you may be ignoring the effects of lazy loading which is used by the foreign package according to the first page of its manual. Essentially, packages that use lazy loading defer the loading of objects, such as functions, until the objects are called upon for the first time. So your argument that ":: must be faster than require" is not necessarily true as foreign is not loading all of its contents into memory when you attach it with require. For full details on lazy loading, see Prof. Ripley's article in RNews, Volume 4, Issue 2.




回答2:


Since the time to load a package is almost always small compared to the time you spend trying to figure out what the code you wrote six months ago was about, in this case coding for clarity is the most important thing.

For scripts, having a call to require or library at the start lets you know which packages you need straight away.

Similarly, calling require (or a wrapper like requirePackage in Hmisc or try_require in ggplot2) at the start of a function is the most unambiguous way of showing that you need to use that package.

:: should be reserved for cases when you have naming conflicts between packages – compare, e.g.,

Hmisc::is.discrete

and

plyr::is.discrete


来源:https://stackoverflow.com/questions/4372145/r-writing-style-require-vs

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!