Only read selected columns

前端未结

关注

 4  1680

春和景丽 2020-11-22 04:23

Can anyone please tell me how to read only the first 6 months (7 columns) for each year of the data below, for example by using read.table()?

Ye


      
      
        
          4条回答        

        
                    
            
            
                         
                
              
              
                
                   萌比男神i
                                             
                
                
                (楼主)
            
              
              
                2020-11-22 04:41
              

            
            
                        
To read a specific set of columns from a dataset you, there are several other options:

1) With freadfrom the data.table-package:

You can specify the desired columns with the select parameter from fread from the data.table package. You can specify the columns with a vector of column names or column numbers.

For the example dataset:

library(data.table)
dat <- fread("data.txt", select = c("Year","Jan","Feb","Mar","Apr","May","Jun"))
dat <- fread("data.txt", select = c(1:7))


Alternatively, you can use the drop parameter to indicate which columns should not be read:

dat <- fread("data.txt", drop = c("Jul","Aug","Sep","Oct","Nov","Dec"))
dat <- fread("data.txt", drop = c(8:13))


All result in:

> data
  Year Jan Feb Mar Apr May Jun
1 2009 -41 -27 -25 -31 -31 -39
2 2010 -41 -27 -25 -31 -31 -39
3 2011 -21 -27  -2  -6 -10 -32


UPDATE: When you don't want fread to return a data.table, use the data.table = FALSE-parameter, e.g.: fread("data.txt", select = c(1:7), data.table = FALSE) 

2) With read.csv.sql from the sqldf-package:

Another alternative is the read.csv.sql function from the sqldf package:

library(sqldf)
dat <- read.csv.sql("data.txt",
                    sql = "select Year,Jan,Feb,Mar,Apr,May,Jun from file",
                    sep = "\t")


3) With the read_*-functions from the readr-package:

library(readr)
dat <- read_table("data.txt",
                  col_types = cols_only(Year = 'i', Jan = 'i', Feb = 'i', Mar = 'i',
                                        Apr = 'i', May = 'i', Jun = 'i'))
dat <- read_table("data.txt",
                  col_types = list(Jul = col_skip(), Aug = col_skip(), Sep = col_skip(),
                                   Oct = col_skip(), Nov = col_skip(), Dec = col_skip()))
dat <- read_table("data.txt", col_types = 'iiiiiii______')


From the documentation an explanation for the used characters with col_types:


  each character represents one column: c = character, i = integer, n = number, d = double, l = logical, D = date, T = date time, t = time, ? = guess, or _/- to skip the column

    
             
                                                        
            
            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                       
     查看其它4个回答


            
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
                              			
        
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复