问题
If I need to treat R objects in different ways according to their class, I can either use if
and else
within a single function:
foo <- function (x) {
if (inherits(x, 'list')) {
# Foo the list
} else if (inherits(x, 'numeric')) {
# Foo the numeric
} else {
# Throw an error
}
}
Or I can define a method:
foo <- function (x) UseMethod('foo')
foo.list <- function (x) {
# Foo the list
}
foo.numeric <- function (x) {
# Foo the numeric
}
What are the advantages to each approach? Are there performance implications?
回答1:
OK, there is some background to be covered to answer this question (in my view)...
Within R, the class of an object is explicit in situations where you have user-defined object structures or an object such as a factor vector or data frame where other attributes play an important part in the handling of the object itself—for example, level labels of a factor vector, or variable names in a data frame, are modifiable attributes that play a primary role in accessing the observations of each object.
Note, however, that elementary R objects such as vectors, matrices, and arrays, are implicitly classed, which means the class is not identified with the attributes function. Whether implicit or explicit, the class of a given object can always be retrieved using the attribute-specific function class.
When a generic function foo
is applied to an object
with class attribute c("first", "second"), the system searches for a function called foo.first and, if it finds it, applies it to the object. If no such function is found, a function called foo.second
is tried. If no class name produces a suitable function, the function foo.default
is used (if it exists). If there is no class attribute, the implicit class is tried, then the default
method.
The function class prints the vector of names of classes an object inherits from.
class
<- sets the classes an object inherits from.
inherits() indicates whether its first argument inherits from any of the classes specified in the what argument. Method dispatch takes place based on the class of the first argument to the generic function. If which is TRUE then an integer vector of the same length as what is returned. Each element indicates the position in the class(x) matched by the element of what; zero indicates no match. If which is FALSE then TRUE is returned by inherits if any of the names in what match with any class.
All but inherits() are primitive functions.
Considerations
OK, so let us now consider your examples in reverse order...
foo <- function (x) UseMethod('foo')
foo.list <- function (x) {
# Foo the list
}
foo.numeric <- function (x) {
# Foo the numeric
}
now if we use the function methods()
methods(foo)
[1] foo.list foo.numeric
see '?methods' for accessing help and source code
> getS3method('foo','list')
function (x) {
# Foo the list
}
thus we have a class foo
and two associated methods foo.list
and foo.numeric
. Thus, we now know that class foo
, has methods to support list
and numeric
operations.
OK, now let's consider your first example...
function (x) {
if (inherits(x, 'list')) {
# Foo the list
print(paste0("List: ", x))
} else if (inherits(x, 'numeric')) {
# Foo the numeric
print(paste0("Numeric: ", x))
} else {
# Throw an error
print(paste0("Unhandled - Sorry!"))
}
}
the problem is that this is not an s3 class, it is an R function. If you run methods()
against foo
it returns "no methods found"
> methods(foo)
no methods found
> getS3method('foo','list')
Error in getS3method("foo", "list") : no function 'foo' could be found
so what is happening in the second example? The inherits() operation is matching the class of the parameter. inherits() -> Method dispatch takes place based on the class of the first argument to the generic function.
So your first example is simply looking up the class of the function argument x, no S3 class is created or exists.
What are the advantages to each approach? Are there performance implications?
OK, I am biased here but an object’s class is one of the most useful attributes for describing an entity in R. Every object you create is identified, either implicitly or explicitly, with at least one class. R is an object-oriented programming language, meaning entities are stored as objects and have methods that act upon them.
So the second approach is the way to go in my opinion. Why? Because you are truly using the language construct as intended. The first approach where you use inherits() explicitly feels like a hack. Readability is key to comprehension from my personal perspective, thus I worry that a person reading the first example might be led to ask the question "Why did they (the programmer) take said approach, what am I missing?". My concern then is that complexity is to be avoided as it can impede code comprehension. Thus, keep it simple is advantageous to code comprehension.
In reference to code performance, an if-else parser is generally going to be faster than an object lookup model though a lookup model is not equivalent to a class mapping process so I feel the performance question is tricky to answer in this context. Why? The two approaches are different.
I hope the above points you in the right direction. Stay safe, good karma flying your way.
A couple of Book recommendations here:
- R Inferno by Patrick Burns
- Advanced R by Hadley Wickham
- R for Everyone: Advanced Analytics and Graphics
来源:https://stackoverflow.com/questions/61322898/usemethod-vs-inherits-to-determine-an-objects-class-in-r