Why is the class of a vector the class of the elements of the vector and not vector itself?

时光说笑 2020-12-16 02:48

I don't understand why the class of a vector is the class of the elements of the vector and not vector itself.

vector <- c("la", "la", "la")
class(vector)


        
4 answers
  • 2020-12-16 03:32

    R needs to know the class of the object you are operating on to perform the appropriate method dispatch on that object. The atomic data type in R is a vector; there is no such thing as a scalar. R considers a single integer a length-one vector, so is.vector( 1L ) returns TRUE.

    In order to dispatch the correct method, R must know the data type. It's not much use knowing that something is a vector when your language is implicitly vectorised and everything is designed to operate on a vector.

    is.vector( list( NULL , NULL ) )  # TRUE
    is.vector( NA )                   # TRUE
    is.vector( "a" )                  # TRUE
    is.vector( c( 1.0556 , 2L ) )     # TRUE
    

    So you can take the return value of class( 1L ), which is [1] "integer", to mean: I am an atomic vector consisting of type integer.
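
    As a small illustration of the dispatch point above, here is a sketch using a made-up S3 generic. The name describe and its methods are hypothetical and not part of base R; the only thing being shown is that the implicit class of a plain vector is what selects the method.

    ```r
    # 'describe' is a hypothetical generic used only for illustration.
    describe <- function(x) UseMethod("describe")
    describe.character <- function(x) "character vector"
    describe.integer   <- function(x) "integer vector"
    describe.default   <- function(x) "something else"

    describe(c("la", "la", "la"))  # "character vector"
    describe(1L)                   # "integer vector"
    describe(TRUE)                 # "something else" (no logical method defined)
    ```

    If class() only ever returned "vector" for these inputs, all three calls would land on the same method.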

    Despite the fact that under the hood a matrix is actually just a vector with a dim attribute of length two, R must know it is a matrix so that it can operate row-wise or column-wise on its elements (or individually on any single subscripted element). After subsetting, you get back a vector of the data type of the elements in your matrix, which allows R to dispatch the appropriate method for your data (e.g. performing sort on a character vector or a numeric vector):

    /* from the underlying C code in /src/main/subset.c....*/
    result = allocVector(TYPEOF(x), (R_xlen_t) nrs * (R_xlen_t) ncs);
    

    You should also note that before determining the class of an object, R will always check that it is first a vector. For example, when running is.matrix(x) on some matrix x, R checks that it is a vector and then checks for the dimension attribute. If the dimension attribute is a vector of INTEGER data type and of LENGTH 2, the object satisfies the conditions for being a matrix (the following code snippet is from Rinlinedfuns.h in /src/include/):

    INLINE_FUN Rboolean isMatrix(SEXP s)
    {
        SEXP t;
        if (isVector(s)) {
            t = getAttrib(s, R_DimSymbol);
            /* You are not supposed to be able to assign a non-integer dim,
               although this might be possible by misuse of ATTRIB. */
            if (TYPEOF(t) == INTSXP && LENGTH(t) == 2)
                return TRUE;
        }
        return FALSE;
    }
    
    #  e.g. create an array with height and width....  
    a <- array( 1:4 , dim=c(2,2) )
    
    #  this is a matrix!
    class(a)
    #[1] "matrix" "array"   (R >= 4.0.0; older versions print only "matrix")
    
    # And the class of the first column is an atomic vector of type integer....
    class(a[,1])
    #[1] "integer"
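
    The same check can be observed from R itself. The following is a minimal sketch (the variable x is just an example) showing that attaching or removing a length-2 dim attribute is all it takes to switch is.matrix() on and off:

    ```r
    x <- 1:6
    is.matrix(x)         # FALSE: no dim attribute yet
    dim(x) <- c(2L, 3L)  # attach a length-2 dim attribute
    is.matrix(x)         # TRUE
    class(x[, 1])        # "integer": a single column drops the dim attribute
    dim(x) <- NULL       # remove the attribute again
    is.matrix(x)         # FALSE
    is.vector(x)         # TRUE
    ```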
    
  • 2020-12-16 03:33

    Here's the best diagram I've found that lays out the class hierarchy used by the class function:

    Although the class names don't correspond exactly with the results of the R class function, I believe the hierarchy is basically accurate. The key to your answer is that the class function only gives the root class in the hierarchy.

    You will see that Vector is not a root class. The root class for your example would be StrVector, which corresponds to the "character" class, the class for a vector with character elements. In contrast, Matrix is itself a root class; hence, its class is "matrix".

  • 2020-12-16 03:34

    This is what I get from this: class is mainly meant for object-oriented programming, and there are other functions in R which will give you the storage mode of an object (see ?typeof or ?mode).

    Looking at ?class:
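
    To make the difference between these functions concrete, here is a short sketch comparing them on three length-one vectors (the list x and its element names are only for illustration):

    ```r
    x <- list(int = 1L, dbl = 1.5, chr = "la")

    sapply(x, class)   # "integer" "numeric" "character"
    sapply(x, mode)    # "numeric" "numeric" "character"
    sapply(x, typeof)  # "integer" "double"  "character"
    ```

    Note that for an integer vector class and mode disagree, so class is not simply mode in every case.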

    Many R objects have a class attribute, a character vector giving the names of the classes from which the object inherits. If the object does not have a class attribute, it has an implicit class, "matrix", "array" or the result of mode(x)

    It seems like class works as follows:

    1. It first looks for a $class attribute

    2. If there isn't any, it checks if the object has a matrix or an array structure by checking the $dim attribute (which is not present in a vector)

      2.1. if $dim contains two entries, it will call it a matrix

      2.2. if $dim contains one entry or more than two entries, it will call it an array

      2.3. if $dim is of length 0, it goes to the next step (mode)

    3. If $dim is of length 0 and there is no $class attribute, it performs mode
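
    The steps above can be written down as a small function. This is only a sketch of the description in this answer, not R's actual implementation: sketch_class is a hypothetical name, and the real class() has exceptions that the sketch ignores (for example, an integer vector reports "integer" although its mode is "numeric", and since R 4.0.0 a matrix reports both "matrix" and "array").

    ```r
    # Hypothetical helper (not part of base R) mirroring steps 1 to 3 above.
    sketch_class <- function(x) {
      cls <- attr(x, "class", exact = TRUE)
      if (!is.null(cls)) return(cls)                 # step 1: explicit class attribute
      n_dim <- length(attr(x, "dim", exact = TRUE))  # 0 when there is no dim attribute
      if (n_dim == 2L) return("matrix")              # step 2.1
      if (n_dim > 0L) return("array")                # step 2.2
      mode(x)                                        # steps 2.3 and 3
    }

    sketch_class(c("la", "la", "la"))             # "character"
    sketch_class(matrix(rep("la", 3), ncol = 1))  # "matrix"
    sketch_class(array(1:8, c(2, 2, 2)))          # "array"
    sketch_class(data.frame(A = 1:3))             # "data.frame"
    ```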

    So per your example

    mat <- matrix(rep("la", 3), ncol=1)
    vec <- rep("la", 3)
    attributes(vec)
    # NULL
    attributes(mat)
    ## $dim
    ## [1] 3 1
    

    So you can see that vec doesn't contain any attributes whatsoever (see ?c or ?as.vector for an explanation).

    So in the first case, class performs:

    attributes(vec)$class
    # NULL
    length(attributes(vec)$dim)
    # 0
    mode(vec)
    ## [1] "character"
    

    In the second case it checks:

    attributes(mat)$class
    # NULL
    length(attributes(mat)$dim)
    ##[1] 2
    

    It sees that the object has two dimensions and therefore calls it a matrix.

    In order to illustrate that both vec and mat have the same storage mode, you can do:

    mode(vec)
    ## [1] "character"
    mode(mat)
    ## [1] "character"
    

    You can also see, for example, the same behavior with an array:

    ar <- array(rep("la", 3), c(3, 1)) # two dimensional array
    class(ar)
    ##[1] "matrix" "array"   (R >= 4.0.0; older versions print only "matrix")
    ar <- array(rep("la", 3), c(3, 1, 1)) # three dimensional array
    class(ar)
    ##[1] "array"
    

    So both array and matrix don't carry a class attribute. Let's check, for example, what data.frame does.

    df <- data.frame(A = rep("la", 3))
    class(df)
    ## [1] "data.frame"
    

    Where did class take it from?

    attributes(df)    
    # $names
    # [1] "A"
    # 
    # $row.names
    # [1] 1 2 3
    # 
    # $class
    # [1] "data.frame"
    

    As you can see, data.frame sets a $class attribute, but this can be changed:

    attributes(df)$class <- NULL
    class(df)
    ## [1] "list"
    

    Why list? Because a data.frame doesn't have a $dim attribute (nor a $class one, because we just deleted it), class falls back to mode(df):

    mode(df)
    ## [1] "list"
    

    Lastly, in order to illustrate how class works, we can manually set the class to whatever we want and see what it gives us back:

    mat <- structure(mat, class = "vector")
    vec <- structure(vec, class = "vector")
    class(mat)
    ## [1] "vector"
    class(vec)
    ## [1] "vector"
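
    One way to tell an implicit class from a stored class attribute is oldClass(), which returns only the attribute. A minimal sketch, with variable names chosen just for this example:

    ```r
    mat <- matrix(rep("la", 3), ncol = 1)

    oldClass(mat)              # NULL: no class attribute is stored
    inherits(mat, "matrix")    # TRUE: the implicit class is still used

    tagged <- structure(mat, class = "vector")
    oldClass(tagged)           # "vector": now the attribute exists

    # unclass() removes the attribute, so the implicit class comes back
    class(unclass(tagged))[1]  # "matrix"
    ```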
    
  • 2020-12-16 03:55

    In the R language definition, there are six basic types of vector, one of which is "character". There really isn't a base "vector" type, but six different kinds of vectors that are all base types.

    On the other hand, Matrix is a type of data structure.
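
    As a quick check of the six basic vector types, the sketch below builds one length-one example of each (the list name atomic_examples is only illustrative) and asks R for its type and class:

    ```r
    atomic_examples <- list(TRUE, 1L, 1.5, 1i, "la", as.raw(1))

    vapply(atomic_examples, typeof, character(1))
    # "logical" "integer" "double" "complex" "character" "raw"

    vapply(atomic_examples, class, character(1))
    # "logical" "integer" "numeric" "complex" "character" "raw"

    vapply(atomic_examples, is.vector, logical(1))
    # all TRUE: each one is a vector, but none has class "vector"
    ```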
