What are the suggested practices for function polymorphism in R?

Question

Suppose I want to write a function in R that depends only on a couple of sufficient statistics of some data. For example, suppose the function, call it foo.func, depends only on the sample mean of a sample of data. For convenience, I think users might like to pass foo.func either the sample itself (in which case foo.func computes the sample mean) or the sample mean, which is all that foo.func needs. For efficiency, the latter is preferred when several functions like foo.func are called that can all take the sample mean: the mean then needs to be computed only once (in my real problem, the sample statistics in question might be computationally intensive).

In summary, I would like to write foo.func to be accessible to the beginner (pass in the data, let the function compute the sufficient statistics) as well as the expert (precompute the sufficient statistics for efficiency and pass them in). What are the recommended practices for this? Do I have a logical flag passed in? Multiple arguments? Some ways to do it might be:

#optional arguments
foo.func <- function(xdata, suff.stats=NULL) {
  if (is.null(suff.stats)) {
    suff.stats <- compute.suff.stats(xdata)
  }
  #now operate on suff.stats
}

#flag input
foo.func <- function(data.or.stat, gave.data=TRUE) {
  if (gave.data) {
    data.or.stat <- compute.suff.stats(data.or.stat)
  }
  #now operate on data.or.stat
}

I am leaning towards the former, I think.

Answer 1:

You can also make the sufficient statistic an argument whose default value is a function call on the data, as:

foo.func <- function(x, suff.stats = foo.func.suff.stat(x)){
  # your code here
}

As an example:

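foo.func <- function(x, avg = mean(x)){
  # avg defaults to mean(x); the default is evaluated only if avg is not supplied
  return(avg)
}

foo.func(1:20)      # returns 10.5 (mean computed from the data)
foo.func(avg = 42)  # returns 42 (x is never needed)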
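This works because R evaluates default arguments lazily: the default expression runs only if the argument is actually used and the caller did not supply it. A minimal sketch of that behaviour, assuming a hypothetical slow.mean helper (with a call counter) as a stand-in for an expensive statistic, and a hypothetical foo.lazy in place of foo.func:

```r
# Hypothetical stand-in for an expensive statistic; counts its own calls
n.calls <- 0
slow.mean <- function(x) {
  n.calls <<- n.calls + 1
  mean(x)
}

# The default for avg is evaluated only if avg is used and was not supplied
foo.lazy <- function(x, avg = slow.mean(x)) {
  avg * 2
}

x <- c(1, 2, 3, 6)

res.beginner <- foo.lazy(x)          # default is evaluated: slow.mean runs once
calls.after.beginner <- n.calls

pre <- slow.mean(x)                  # the expert precomputes the statistic once
res.expert1 <- foo.lazy(avg = pre)   # and reuses it; x is never touched
res.expert2 <- foo.lazy(avg = pre)
calls.after.expert <- n.calls
```

In this sketch the two expert calls add nothing to the counter, which is the efficiency gain the question asks for.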

Alternatively, you can give the arguments a default value of NULL and test with is.null(argument), or simply check missing(argument) for each argument you might need to calculate.
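The two checks can be sketched side by side. The names foo.null and foo.missing are hypothetical, and compute.suff.stats here is only an assumed placeholder that returns the mean and sample size:

```r
# Hypothetical helper standing in for the real sufficient-statistic computation
compute.suff.stats <- function(x) list(mean = mean(x), n = length(x))

# Variant 1: NULL default, tested with is.null()
foo.null <- function(xdata = NULL, suff.stats = NULL) {
  if (is.null(suff.stats)) {
    if (is.null(xdata)) stop("supply either xdata or suff.stats")
    suff.stats <- compute.suff.stats(xdata)
  }
  suff.stats$mean
}

# Variant 2: no default, tested with missing()
foo.missing <- function(xdata, suff.stats) {
  if (missing(suff.stats)) {
    suff.stats <- compute.suff.stats(xdata)
  }
  suff.stats$mean
}

x <- c(2, 4, 6)
stats <- compute.suff.stats(x)

from.data.null    <- foo.null(x)
from.stats.null   <- foo.null(suff.stats = stats)
from.data.missing <- foo.missing(x)
from.stats.missing <- foo.missing(suff.stats = stats)
```

One design consideration: a NULL default is easy to forward from a wrapper function (the wrapper can simply pass NULL along), whereas missing() reports on how the current function was called, so it tends to be less convenient when calls are nested.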

Update 1: I erred in suggesting use of a default value of NA: it is far more appropriate to use NULL. Using NA and is.na() will behave oddly for vector inputs, whereas NULL is just a single object - one cannot create a vector of NULL values, so is.null(argument) behaves as expected. Apologies for the forgetfulness.
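A short sketch of the difference, using an arbitrary example vector: is.na() answers element by element, so its result is unsuitable as a single if() condition when the input has more than one element (recent versions of R signal an error in that case), while is.null() always returns one value.

```r
x <- c(1, NA, 3)

na.check   <- is.na(x)     # one logical value per element
null.check <- is.null(x)   # always a single logical value

length(na.check)
length(null.check)

# c() drops NULLs, so combining them does not produce a "vector of NULLs"
combined <- c(NULL, NULL)
is.null(combined)
```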

Answer 2:

The R way of implementing polymorphism is through a CLOS (Common Lisp's OO) model where methods are associated with generic functions (verbs) rather than classes (nouns). For instance,

# surprising that there is not an equivalent function in R
# to propagate inheritance...
addclass <- function(x,classname) structure(x,class=append(class(x),classname))

# this should be your main function that does stuff
# here, the identity function is assigned for example
dostuff <- identity

# define generic function and methods
foo <- function(x,...) UseMethod("foo")
foo.raw <- function(x,...) dostuff(mean(x))
foo.stats <- function(x,...) dostuff(x)

# define two types of inputs
x <- 1:10
x <- addclass(x,"raw")

y <- 5
y <- addclass(y,"stats")

# apply
foo(x)
# [1] 5.5
foo(y)
# [1] 5
# attr(,"class")
# [1] "numeric" "stats"
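A possible variation on the same S3 idea, sketched under the assumption that beginners should not have to tag their data at all: a default method treats any untagged input as raw data, and only precomputed statistics carry a class. The names bar and suffstats are hypothetical and not part of the original answer.

```r
# Hypothetical constructor: tags a precomputed mean as sufficient statistics
suffstats <- function(avg) structure(list(avg = avg), class = "suffstats")

# Generic function
bar <- function(x, ...) UseMethod("bar")

# Anything without a more specific method is treated as raw data:
# compute the statistic, wrap it, and dispatch again
bar.default <- function(x, ...) bar(suffstats(mean(x)))

# Precomputed statistics are used directly
bar.suffstats <- function(x, ...) x$avg

res.raw <- bar(1:10)             # beginner: pass the data
res.pre <- bar(suffstats(5.5))   # expert: pass the precomputed statistic
```

Here both calls end up in bar.suffstats, so the real work only has to be written once.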

The example uses R's S3 OOP model, which I think is quite sufficient; S4 is more modern and safer but adds a lot of boilerplate.
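For comparison, a rough S4 sketch of the same dispatch, to give a sense of the extra declarations involved. The class name SuffStats and the generic foo4 are hypothetical names chosen for illustration:

```r
library(methods)

# Hypothetical S4 class wrapping a precomputed sample mean
setClass("SuffStats", representation(avg = "numeric"))

# Generic function with formal dispatch on the class of x
setGeneric("foo4", function(x, ...) standardGeneric("foo4"))

# Raw numeric data: compute the mean first
setMethod("foo4", "numeric", function(x, ...) mean(x))

# Precomputed statistics: use the stored value directly
setMethod("foo4", "SuffStats", function(x, ...) x@avg)

res.raw4 <- foo4(c(1, 2, 3, 6))
res.pre4 <- foo4(new("SuffStats", avg = 3))
```

Unlike the S3 version, the slot type is checked when the object is created, so new("SuffStats", avg = "a") would be rejected.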

Source: https://stackoverflow.com/questions/7932808/what-are-the-suggested-practices-for-function-polymorphism-in-r

Tags

polymorphism