Why is the class of a vector the class of the elements of the vector and not vector itself?


I don't understand why the class of a vector is the class of the elements of the vector and not vector itself.

vector <- c("la", "la", "la")
class(vector)


                      
4 Answers

隐瞒了意图╮, 2020-12-16 03:32

R needs to know the class of the object you are operating on in order to perform the appropriate method dispatch on that object. The basic data structure in R is the vector; there is no such thing as a scalar. R considers a single integer to be a vector of length one, so is.vector( 1L ) returns TRUE.
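
As a small illustration of the "no scalars" point, the sketch below checks a single integer with a few base R functions. The variable name x is arbitrary and only used for this example.

```r
# A single integer is stored as an integer vector of length one.
x <- 1L

length(x)             # number of elements: 1
is.vector(x)          # TRUE
identical(x, c(1L))   # TRUE: c() around one value changes nothing

# Arithmetic is vectorised, so the same operator works on length-one
# and longer vectors alike (the shorter operand is recycled).
x + 1:3
```

Because every value is already a vector, there is no separate code path for single values.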

In order to dispatch the correct method, R must know the data type. It is not much use knowing that something is a vector when the language is implicitly vectorised and everything is designed to operate on vectors. All of the following return TRUE:

is.vector( list( NULL , NULL ) )
is.vector( NA )
is.vector( "a" )
is.vector( c( 1.0556 , 2L ) )


So you can take the return value of class( 1L ), which is [1] "integer", to mean "I am an atomic vector of type integer".
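
To make the dispatch argument concrete, here is a minimal S3 sketch. The generic describe and its methods are hypothetical names invented for this example; they are not part of base R. It only shows that R picks a method from the (implicit) class of the argument.

```r
# Hypothetical generic: dispatches on the class of its argument.
describe <- function(x) UseMethod("describe")

# One method per class we care about, plus a fallback.
describe.character <- function(x) "character vector"
describe.integer   <- function(x) "integer vector"
describe.matrix    <- function(x) "matrix"
describe.default   <- function(x) "something else"

m <- matrix(1:4, nrow = 2)

describe(c("la", "la", "la"))  # uses describe.character
describe(1L)                   # uses describe.integer
describe(m)                    # uses describe.matrix
describe(m[, 1])               # a column is a plain integer vector again
describe(list())               # no list method defined, so the default is used
```

If class() returned "vector" for all of these, a generic would have nothing to distinguish them by.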

Under the hood a matrix is just a vector with a dim attribute of length two, but R must still know it is a matrix so that it can operate row-wise or column-wise on its elements (or on any single subscripted element). Subsetting a row or column returns a vector with the data type of the matrix elements, which lets R dispatch the appropriate method for your data (e.g. sort on a character vector versus a numeric vector):

/* from the underlying C code in /src/main/subset.c....*/
result = allocVector(TYPEOF(x), (R_xlen_t) nrs * (R_xlen_t) ncs);
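
The same idea can be seen from the R side. The sketch below (variable names are arbitrary) turns a plain integer vector into a matrix by attaching a dim attribute, then shows that extracting a column drops that attribute again.

```r
# Start with a plain integer vector: no attributes at all.
v <- 1:6
attributes(v)                 # NULL

# Attaching a length-two dim attribute is all it takes to get a matrix.
m <- v
dim(m) <- c(2L, 3L)
is.matrix(m)                  # TRUE
typeof(m)                     # still "integer": the storage is unchanged

# Extracting one column drops the dim attribute, leaving a plain vector.
col <- m[, 1]
attributes(col)               # NULL
class(col)                    # "integer"

# drop = FALSE keeps the dim attribute, so the result is still a matrix.
is.matrix(m[, 1, drop = FALSE])
```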


You should also note that before deciding that an object is a matrix, R first checks that it is a vector. For example, when running is.matrix(x) on some matrix x, R checks that x is a vector and then looks at its dimension attribute. If the dimension attribute is an INTEGER vector of LENGTH 2, the object satisfies the conditions for being a matrix (the following code snippet is from Rinlinedfuns.h in /src/include/):

INLINE_FUN Rboolean isMatrix(SEXP s)
{
    SEXP t;
    if (isVector(s)) {
        t = getAttrib(s, R_DimSymbol);
        /* You are not supposed to be able to assign a non-integer dim,
           although this might be possible by misuse of ATTRIB. */
        if (TYPEOF(t) == INTSXP && LENGTH(t) == 2)
            return TRUE;
    }
    return FALSE;
}

#  e.g. create an array with height and width....  
a <- array( 1:4 , dim=c(2,2) )

#  this is a matrix!
class(a)
#[1] "matrix"    (R >= 4.0.0 prints: "matrix" "array")

# And the first column is an atomic vector of type integer....
class(a[,1])
#[1] "integer"
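
For readers who do not read C, the isMatrix check quoted above can be approximated in R. This is only a sketch: is_matrix_sketch is a made-up name, and the "is a vector" test is an assumption that stands in for the C-level isVector (atomic vectors, lists and expressions).

```r
# Rough R-level approximation of the C function isMatrix().
is_matrix_sketch <- function(x) {
  # Assumed stand-in for the C-level isVector() test.
  vector_like <- is.atomic(x) || is.list(x) || is.expression(x)
  # Read the raw dim attribute (exact = TRUE avoids partial name matching).
  d <- attr(x, "dim", exact = TRUE)
  vector_like && is.integer(d) && length(d) == 2L
}

a <- array(1:4, dim = c(2, 2))

is_matrix_sketch(a)                          # TRUE: dim has length 2
is_matrix_sketch(a[, 1])                     # FALSE: dim was dropped
is_matrix_sketch(array(1:8, c(2, 2, 2)))     # FALSE: dim has length 3
```

For these inputs the sketch agrees with the built-in is.matrix.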

                                                                        
                                                        
            
            
              
                
时光取名叫无心, 2020-12-16 03:33

Here's the best diagram I've found that lays out the class hierarchy used by the class function (the image is not reproduced here).

Although the class names don't correspond exactly with the results of the R class function, I believe the hierarchy is basically accurate. The key to your answer is that the class function only gives the root class in the hierarchy.

You will see that Vector is not a root class. The root class for your example would be StrVector, which corresponds to the "character" class, the class for a vector with character elements. In contrast, Matrix is itself a root class; hence, its class is "matrix".
                                                                        
                                                        
            
            
              
                
闹比i, 2020-12-16 03:34

This is what I get from this: class is mainly meant for object-oriented programming, and there are other functions in R that will give you the storage mode of an object (see ?typeof or ?mode).

When looking at ?class


  Many R objects have a class attribute, a character vector giving the
  names of the classes from which the object inherits. If the object
  does not have a class attribute, it has an implicit class, "matrix",
  "array" or the result of mode(x)


It seems like class works as follows


1. It first looks for a $class attribute.
2. If there isn't one, it checks whether the object has a matrix or an array structure by checking the $dim attribute (which is not present in a plain vector).
   2.1. If $dim contains two entries, it calls the object a matrix.
   2.2. If $dim contains one entry or more than two entries, it calls the object an array.
   2.3. If $dim is of length 0, it goes to the next step (mode).
3. If $dim is of length 0 and there is no $class attribute, it performs mode.
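
The steps above can be sketched as a small R function. The name class_sketch is hypothetical, and the function follows this answer's description rather than R's actual implementation. Two known differences: from R 4.0.0 onwards the real class() returns c("matrix", "array") for a matrix, and the real fallback is not exactly mode (for example class(1L) is "integer" while mode(1L) is "numeric").

```r
# Sketch of the lookup order described above (not R's real implementation).
class_sketch <- function(x) {
  # Step 1: an explicit class attribute wins.
  cls <- attr(x, "class", exact = TRUE)
  if (!is.null(cls)) return(cls)

  # Step 2: otherwise look at the length of the dim attribute.
  n_dim <- length(attr(x, "dim", exact = TRUE))
  if (n_dim == 2L) return("matrix")
  if (n_dim > 0L)  return("array")

  # Step 3: no class and no dim, so fall back to mode().
  mode(x)
}

class_sketch(rep("la", 3))                       # "character"
class_sketch(matrix(rep("la", 3), ncol = 1))     # "matrix"
class_sketch(array(rep("la", 3), c(3, 1, 1)))    # "array"
class_sketch(data.frame(A = rep("la", 3)))       # "data.frame"
```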


So per your example

mat <- matrix(rep("la", 3), ncol=1)
vec <- rep("la", 3)
attributes(vec)
# NULL
attributes(mat)
## $dim
## [1] 3 1


So you can see that vec doesn't carry any attributes whatsoever (see ?c or ?as.vector for an explanation).

So in the first case, class effectively performs

attributes(vec)$class
# NULL
length(attributes(vec)$dim)
# 0
mode(vec)
## [1] "character"


In the second case it checks

attributes(mat)$class
# NULL
length(attributes(mat)$dim)
##[1] 2


It sees that the object has two dimensions and therefore calls it a matrix.

To illustrate that both vec and mat have the same storage mode, you can do

mode(vec)
## [1] "character"
mode(mat)
## [1] "character"
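
Since typeof, mode and class are easy to confuse, here is a small side-by-side comparison. The helper type_summary is a hypothetical name used only for this sketch; it reports the first element of class() so that the result always has four entries.

```r
# Collect the four "what kind of thing is this" answers in one named vector.
type_summary <- function(x) {
  c(typeof       = typeof(x),
    mode         = mode(x),
    storage.mode = storage.mode(x),
    class        = class(x)[1])
}

type_summary(1L)     # typeof and class "integer", but mode "numeric"
type_summary(1.5)    # typeof "double", mode and class "numeric"
type_summary("la")   # "character" everywhere

# A matrix changes only the class entry; the storage is untouched.
type_summary(matrix(rep("la", 3), ncol = 1))
```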


You can also see the same behavior with an array

ar <- array(rep("la", 3), c(3, 1)) # two dimensional array
class(ar)
##[1] "matrix"    (R >= 4.0.0 prints: "matrix" "array")
ar <- array(rep("la", 3), c(3, 1, 1)) # three dimensional array
class(ar)
##[1] "array"


So neither array nor matrix sets a class attribute. Let's check, for example, what data.frame does.

df <- data.frame(A = rep("la", 3))
class(df)
## [1] "data.frame"


Where did class take it from?

attributes(df)    
# $names
# [1] "A"
# 
# $row.names
# [1] 1 2 3
# 
# $class
# [1] "data.frame"


As you can see, data.frame sets a $class attribute, but this can be changed

attributes(df)$class <- NULL
class(df)
## [1] "list"


Why list? Because a data.frame doesn't have a $dim attribute (nor a $class one, because we just deleted it), so class falls back to mode(df)

mode(df)
## [1] "list"
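
A slightly different way to see the same thing, without modifying df in place, is to compare a data frame with an unclassed copy. This is only a sketch using base R functions; the variable names are arbitrary.

```r
df <- data.frame(A = rep("la", 3))

# Underneath, a data frame is stored as a list of columns.
typeof(df)                      # "list"
is.list(df)                     # TRUE
inherits(df, "data.frame")      # TRUE, because of the class attribute

# unclass() returns a copy without the class attribute.
plain <- unclass(df)
class(plain)                    # "list"
inherits(plain, "data.frame")   # FALSE
names(plain)                    # "A": the other attributes are still there
```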


Lastly, in order to illustrate how class works, we can manually set the class to whatever we want and see what it will give us back

mat <- structure(mat, class = "vector")
vec <- structure(vec, class = "vector")
class(mat)
## [1] "vector"
class(vec)
## [1] "vector"

                                                                        
                                                        
            
            
              
                
醉话见心, 2020-12-16 03:55

In the R language definition, there are six basic types of vector, one of which is "character". There really isn't a base "vector" type, but six different kinds of vectors that are all base types.

On the other hand, Matrix is a type of data structure.
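
As a quick check of the "six kinds of vector" statement, the sketch below builds one length-one example of each atomic type and records what typeof and class report. The variable names are arbitrary.

```r
# One length-one example of each of the six atomic vector types.
examples <- list(TRUE, 1L, 1.5, 1i, "la", as.raw(1))

types   <- vapply(examples, typeof, character(1))
classes <- vapply(examples, class,  character(1))

types     # "logical" "integer" "double"  "complex" "character" "raw"
classes   # "logical" "integer" "numeric" "complex" "character" "raw"

# Every one of them is a vector, and none of them reports "vector" as its class.
all(vapply(examples, is.vector, logical(1)))
```

The only place where the two columns differ is double versus numeric.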
                                                                        
                                                        
            
            
              
                