Question
Approach 1
f1 <- function(x)
{
    # Do calculation xyz ....
    f2 <- function(y)
    {
        # Do stuff...
        return(some_object)
    }
    return(f2(x))
}
Approach 2
f2 <- function(y)
{
    # Do stuff...
    return(some_object)
}
f3 <- function(x)
{
    # Do calculation xyz ....
    return(f2(x))
}
Assume f1 and f3 both do the same calculations and give the same result.
Are there any significant advantages in using approach 1, calling f1(), vs approach 2, calling f3()?
Is a certain approach more favourable when:
- large data is being passed in and/or out of f2?
- speed is a big issue, e.g. f1 or f3 are called repeatedly in simulations?
(Approach 1, defining one function inside another, seems common in packages.)
One advantage of using the approach f1 is that f2 won't exist outside f1 once f1 has finished being called (and f2 is only called in f1 or f3).
Answer 1:
Benefits of defining f2 inside f1:
- f2 is only visible within f1, which is useful if f2 is only meant for use within f1, though within package namespaces this is debatable since you just wouldn't export f2 if you defined it outside
- f2 has access to variables within f1, which could be considered a good or a bad thing:
  - good, because you don't have to pass variables through the function interface and you can use <<- to implement stuff like memoization, etc.
  - bad, for the same reasons...
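The `<<-` point can be sketched as follows. This is only an illustration: `fib_memo`, `fib` and `cache` are hypothetical names that do not appear in the question, and the cache here lives for a single call of `fib_memo` rather than across calls.

```r
# Hypothetical example: the inner function shares state with its enclosing function.
fib_memo <- function(n) {
  cache <- c(1, 1)                 # lives in fib_memo's frame, not passed as an argument
  fib <- function(k) {
    if (k <= length(cache)) return(cache[k])   # already computed
    value <- fib(k - 1) + fib(k - 2)
    cache[k] <<- value             # <<- updates cache in the enclosing frame
    value
  }
  fib(n)
}

result <- fib_memo(30)
```

Because `fib` is defined inside `fib_memo`, it can read and update `cache` without the cache appearing in its own argument list, which is both the convenience and the hidden coupling mentioned above.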
Disadvantages:
- f2 needs to be redefined every time you call f1, which adds some overhead (not very much overhead, but definitely there)
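One way to see the redefinition is to return the inner function and compare what two separate calls produce. This is a small sketch with hypothetical names (`make_inner`, `inner`), not code from the question:

```r
# Hypothetical example: every call of the outer function builds a new inner function.
make_inner <- function() {
  inner <- function(x) x   # created anew each time make_inner runs
  inner
}

a <- make_inner()
b <- make_inner()

# The two closures behave identically but were built in different call frames.
same_env <- identical(environment(a), environment(b))
```

Here `same_env` is `FALSE`: each call evaluates the `function` expression again and attaches the result to a fresh environment, which is the small per-call cost described above.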
Data size should not matter since R won't copy the data unless it is being modified under either scenario. As noted in disadvantages, defining f2 outside of f1 should be a little faster, especially if you are repeating an otherwise relatively low overhead operation many times. Here is an example:
> fun1 <- function(x) {
+ fun2 <- function(x) x
+ fun2(x)
+ }
> fun2a <- function(x) x
> fun3 <- function(x) fun2a(x)
>
> library(microbenchmark)
> microbenchmark(
+ fun1(TRUE), fun3(TRUE)
+ )
Unit: nanoseconds
       expr min    lq median    uq   max neval
 fun1(TRUE) 656 674.5  728.5 859.5 17394   100
 fun3(TRUE) 406 434.5  480.5 563.5  1855   100
In this case we save 250ns (edit: the difference is actually 200ns; believe it or not the extra set of {} that fun1 has costs another 50ns). Not much, but can add up if the interior function is more complex or you repeat the function many many times.
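If microbenchmark is not installed, a rougher comparison can be sketched with base R's `system.time`. The loop count `n` is an arbitrary assumption, and the timings will vary by machine, so no expected numbers are given:

```r
# Same functions as in the transcript above
fun1 <- function(x) {
  fun2 <- function(x) x
  fun2(x)
}
fun2a <- function(x) x
fun3 <- function(x) fun2a(x)

n <- 1e5  # number of repetitions (arbitrary choice)

# Elapsed seconds for n calls of each variant
t_nested <- system.time(for (i in seq_len(n)) fun1(TRUE))[["elapsed"]]
t_top    <- system.time(for (i in seq_len(n)) fun3(TRUE))[["elapsed"]]

# Rough nanoseconds per call, including the loop overhead
per_call_ns <- c(nested = t_nested, top_level = t_top) / n * 1e9
```

This measures the loop as well as the calls, so it is less precise than microbenchmark and only useful for checking the direction of the difference.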
Answer 2:
You would typically use approach 2. Some exceptions are:
Function closures:
f = function() {
  counter = 1
  g = function() {
    counter <<- counter + 1
    return(counter)
  }
}
counter = f()
counter()
counter()
Function closures enable us to remember the state.
Sometimes it's handy to define a function locally because it is only used in one place. For example, when using optim, we often tweak an existing function:
pdf = function(x, mu) dnorm(x, mu, log=TRUE)
f = function(d, lower, initial=0) {
  ll = function(mu) {
    # optim minimises, so an out-of-bounds mu is penalised with +Inf
    if(mu < lower) return(Inf)
    else -sum(pdf(d, mu))
  }
  optim(initial, ll)
}
# the starting value must lie inside the bound so that ll(initial) is finite
f(d, 1.5, initial=2)
The ll function uses the data set d and a lower bound. This is convenient, since this may be the only time we use/need the ll function.
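For comparison, here is a sketch of the same fit written both ways. Everything in it is an assumption for illustration: the data are simulated, and `log_pdf`, `fit_nested`, `fit_top`, `nll_top` and `bound` are hypothetical names. The argument is called `bound` rather than `lower` because optim has its own `lower` argument, which would otherwise capture the value instead of passing it to the objective.

```r
set.seed(42)
d <- rnorm(100, mean = 3)   # simulated data, assumed for illustration

log_pdf <- function(x, mu) dnorm(x, mu, log = TRUE)

# Approach 1: the objective is defined inside and captures d and bound
fit_nested <- function(d, bound, initial = bound) {
  nll <- function(mu) {
    if (mu < bound) return(Inf)   # optim minimises, so penalise with +Inf
    -sum(log_pdf(d, mu))
  }
  optim(initial, nll)
}

# Approach 2: a top-level objective; d and bound are passed through optim's `...`
nll_top <- function(mu, d, bound) {
  if (mu < bound) return(Inf)
  -sum(log_pdf(d, mu))
}
fit_top <- function(d, bound, initial = bound) {
  optim(initial, nll_top, d = d, bound = bound)
}

# Nelder-Mead warns about one-dimensional problems; suppressed here for brevity
res_nested <- suppressWarnings(fit_nested(d, 1.5))
res_top    <- suppressWarnings(fit_top(d, 1.5))
```

Both versions should return the same estimate, close to `mean(d)`. The nested version keeps the objective next to its only use, while the top-level version makes every input explicit in the signature.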
Source: https://stackoverflow.com/questions/28331499/what-are-the-benefits-of-defining-and-calling-a-function-inside-another-function