I don't understand why the class of a vector is the class of the elements of the vector and not vector itself.
vector <- c("la", "la", "la")
class(vector)
## [1] "character"
R needs to know the class of the object you are operating on in order to dispatch the appropriate method for that object. The atomic data type in R is a vector; there is no such thing as a scalar. R considers a single integer to be a vector of length one, so is.vector( 1L ) returns TRUE.
In order to dispatch the correct method, R must know the data type. It's not much use knowing that something is a vector when the language is implicitly vectorised and everything is designed to operate on vectors.
is.vector( list( NULL , NULL ) )
is.vector( NA )
is.vector( "a" )
is.vector( c( 1.0556 , 2L ) )
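To make the "no scalars" point concrete, here is a small sketch (the variable names `x` and `checks` are just for illustration). It shows that a single integer behaves exactly like a length-one vector and that all four calls above return TRUE.

```r
# A "scalar" is simply a vector of length one.
x <- 1L
length(x)             # 1
is.vector(x)          # TRUE
identical(x, c(1L))   # TRUE: wrapping it in c() changes nothing
identical(x[1], x)    # TRUE: it can be subscripted like any vector

# The four calls from the text, collected into one logical vector.
checks <- c(
  is.vector(list(NULL, NULL)),
  is.vector(NA),
  is.vector("a"),
  is.vector(c(1.0556, 2L))
)
all(checks)           # TRUE

# Mixing a double and an integer coerces to double, so the class is "numeric".
class(c(1.0556, 2L))  # "numeric"
```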
So you can take the return value of class( 1L ), which is [1] "integer", to mean: I am an atomic vector consisting of type integer.
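As a rough illustration of why the element type matters for dispatch, here is a minimal S3 sketch. The generic `describe` and its methods are hypothetical names made up for this example, not part of base R.

```r
# 'describe' is a hypothetical generic, invented for this illustration only.
describe <- function(x) UseMethod("describe")

# Methods are selected using the (implicit) class of the argument.
describe.character <- function(x) "character vector"
describe.integer   <- function(x) "integer vector"
describe.default   <- function(x) "no specific method"

describe(c("la", "la", "la"))  # "character vector"
describe(1L)                   # "integer vector"
describe(list(1, 2))           # "no specific method"
```

If class() only ever answered "vector", none of these methods could be told apart.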
Despite the fact that, under the hood, a matrix is just a vector with a dimension attribute of length two, R must know it is a matrix so that it can operate row-wise or column-wise on its elements (or individually on any single subscripted element). After subsetting, you get back a vector of the data type of the elements in your matrix, which allows R to dispatch the appropriate method for your data (e.g. performing sort on a character vector or a numeric vector):
/* from the underlying C code in /src/main/subset.c....*/
result = allocVector(TYPEOF(x), (R_xlen_t) nrs * (R_xlen_t) ncs)
You should also note that before determining the class of an object, R always checks that it is first a vector. For example, when running is.matrix(x) on some matrix x, R checks that it is a vector and then checks for a dimension attribute. If the dimension attribute is a vector of INTEGER type and of LENGTH 2, the object satisfies the conditions for being a matrix (the following code snippet is from Rinlinedfuns.h in /src/include/):
INLINE_FUN Rboolean isMatrix(SEXP s)
{
    SEXP t;
    if (isVector(s)) {
        t = getAttrib(s, R_DimSymbol);
        /* You are not supposed to be able to assign a non-integer dim,
           although this might be possible by misuse of ATTRIB. */
        if (TYPEOF(t) == INTSXP && LENGTH(t) == 2)
            return TRUE;
    }
    return FALSE;
}
# e.g. create an array with height and width....
a <- array( 1:4 , dim=c(2,2) )
# this is a matrix!
class(a)
#[1] "matrix"   (R >= 4.0.0 prints "matrix" "array")
# And the class of the first column is an atomic vector of type integer....
class(a[,1])
#[1] "integer"
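The same check can be watched from the R side. This is only a sketch of the idea (the variable names are illustrative): adding a length-two dim attribute to a plain vector is enough to make it a matrix, and removing it turns it back into a plain vector, while the element type never changes.

```r
v <- 1:4
before <- is.matrix(v)       # FALSE: no dim attribute yet

dim(v) <- c(2L, 2L)          # attach a length-two integer dim attribute
during <- is.matrix(v)       # TRUE
type_during <- typeof(v)     # "integer": the underlying data is untouched

dim(v) <- NULL               # drop the attribute again
after <- is.matrix(v)        # FALSE
class_after <- class(v)      # "integer"
```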
Here's the best diagram I've found that lays out the class hierarchy used by the class function:
Although the class names don't correspond exactly with the results of the R class function, I believe the hierarchy is basically accurate. The key to your answer is that the class function only gives the root class in the hierarchy.
You will see that Vector is not a root class. The root class for your example would be StrVector, which corresponds to the "character" class, the class for a vector with character elements. In contrast, Matrix is itself a root class; hence, its class is "matrix".
This is what I get from this: class is mainly meant for object-oriented programming, and there are other functions in R that give you the storage mode of an object (see ?typeof or ?mode).
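To see how these functions differ, here is a small comparison. The helper `info` is a hypothetical convenience function written for this example; it just collects the three answers side by side.

```r
# 'info' is a hypothetical helper for this illustration.
# class(x)[1] keeps only the first class name so the result has fixed length.
info <- function(x) {
  c(class = class(x)[1], typeof = typeof(x), mode = mode(x))
}

info(1:3)        # class "integer",   typeof "integer",   mode "numeric"
info(c(1.5, 2))  # class "numeric",   typeof "double",    mode "numeric"
info("la")       # class "character", typeof "character", mode "character"
info(list(1))    # class "list",      typeof "list",      mode "list"
```

Note that for integer vectors class() and mode() disagree, so the implicit class is not always literally the result of mode().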
Looking at ?class, we find:
Many R objects have a class attribute, a character vector giving the names of the classes from which the object inherits. If the object does not have a class attribute, it has an implicit class, "matrix", "array" or the result of mode(x)
It seems like class works as follows:
1. It first looks for a $class attribute.
2. If there isn't any, it checks whether the object has a matrix or an array structure by checking the $dim attribute (which is not present in a vector).
2.1. If $dim contains two entries, it calls it a matrix.
2.2. If $dim contains one entry or more than two entries, it calls it an array.
2.3. If $dim is of length 0, it goes to the next step (mode).
3. If $dim is of length 0 and there is no $class attribute, it performs mode.
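The steps above can be sketched as a plain R function. This is only an approximation of the described behaviour, not R's actual implementation: `implicit_class` is a hypothetical name, the real class() special-cases some types (integer vectors report "integer", not mode's "numeric"), and in R 4.0.0 and later a matrix reports both "matrix" and "array".

```r
# Hypothetical re-implementation of the lookup order described above.
implicit_class <- function(x) {
  # Step 1: an explicit class attribute wins.
  cls <- attr(x, "class", exact = TRUE)
  if (!is.null(cls)) return(cls)

  # Step 2: otherwise look at the dim attribute.
  d <- attr(x, "dim", exact = TRUE)
  if (length(d) == 2L) return("matrix")  # 2.1
  if (length(d) > 0L)  return("array")   # 2.2

  # Step 3: no class and no dim, so fall back to mode().
  mode(x)
}

implicit_class(c("la", "la", "la"))        # "character"
implicit_class(matrix("la", 3, 1))         # "matrix"
implicit_class(array("la", c(3, 1, 1)))    # "array"
implicit_class(data.frame(A = 1))          # "data.frame"
```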
So per your example
mat <- matrix(rep("la", 3), ncol=1)
vec <- rep("la", 3)
attributes(vec)
# NULL
attributes(mat)
## $dim
## [1] 3 1
So you can see that vec doesn't contain any attributes whatsoever (see ?c or ?as.vector for an explanation).
So in the first case, class performs
attributes(vec)$class
# NULL
length(attributes(vec)$dim)
# 0
mode(vec)
## [1] "character"
In the second case it checks
attributes(mat)$class
# NULL
length(attributes(mat)$dim)
##[1] 2
It sees that the object has two dimensions and therefore calls it a matrix.
To illustrate that both vec and mat have the same storage mode, you can do
mode(vec)
## [1] "character"
mode(mat)
## [1] "character"
You can also see, for example, the same behavior with an array
ar <- array(rep("la", 3), c(3, 1)) # two dimensional array
class(ar)
##[1] "matrix"   (R >= 4.0.0 prints "matrix" "array")
ar <- array(rep("la", 3), c(3, 1, 1)) # three dimensional array
class(ar)
##[1] "array"
So neither array nor matrix sets a class attribute. Let's check, for example, what data.frame does.
df <- data.frame(A = rep("la", 3))
class(df)
## [1] "data.frame"
Where did class take it from?
attributes(df)
# $names
# [1] "A"
#
# $row.names
# [1] 1 2 3
#
# $class
# [1] "data.frame"
As you can see, data.frame sets a $class attribute, but this can be changed
attributes(df)$class <- NULL
class(df)
## [1] "list"
Why list? Because a data.frame doesn't have a $dim attribute (nor a $class one, because we just deleted it), so class performs mode(df)
mode(df)
## [1] "list"
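A non-destructive way to see the same thing, as a small sketch: unclass() returns a copy without the class attribute, so the original df stays intact. The name `bare` is just for illustration.

```r
df <- data.frame(A = rep("la", 3))

typeof(df)                   # "list": the storage type was a list all along
is.list(df)                  # TRUE
is.null(attr(df, "dim"))     # TRUE: there is no dim attribute
dim(df)                      # 3 1, computed by a data.frame method instead

bare <- unclass(df)          # copy of df with the class attribute removed
class(bare)                  # "list"
inherits(df, "data.frame")   # TRUE: the original is unchanged
```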
Lastly, to illustrate how class works, we can manually set the class to whatever we want and see what it gives us back
mat <- structure(mat, class = "vector")
vec <- structure(vec, class = "vector")
class(mat)
## [1] "vector"
class(vec)
## [1] "vector"
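One thing worth checking after doing this, sketched below with an illustrative variable `tagged`: the class attribute is only a label. The underlying type stays the same, and removing the label brings back the implicit class. As a side effect, is.vector() now returns FALSE, because it only accepts objects with no attributes other than names.

```r
tagged <- structure(rep("la", 3), class = "vector")

class(tagged)              # "vector": whatever label we assigned
typeof(tagged)             # "character": the data itself is unchanged
inherits(tagged, "vector") # TRUE
is.vector(tagged)          # FALSE: it now carries a class attribute

class(unclass(tagged))     # "character": the implicit class is back
```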
In the R language definition, there are six basic types of vector, one of which is "character". There really isn't a base "vector" type, but six different kinds of vectors that are all base types.
On the other hand, Matrix is a type of data structure.
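For reference, a quick sketch that lists the six basic vector types by asking typeof() about one example value of each. The list `examples` and the matrix `m` are illustrative names.

```r
# One example value for each of the six basic (atomic) vector types.
examples <- list(TRUE, 1L, 1.5, 1i, "la", as.raw(1))

types <- vapply(examples, typeof, character(1))
types
# "logical" "integer" "double" "complex" "character" "raw"

# Every one of them is an atomic vector of length one.
all(vapply(examples, is.atomic, logical(1)))  # TRUE

# A matrix is a structure layered on top of one of these types.
m <- matrix("la", nrow = 3, ncol = 1)
typeof(m)                                     # "character"
```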