Only read selected columns

春和景丽 2020-11-22 04:23

Can anyone please tell me how to read only the first 6 months (7 columns) for each year of the data below, for example by using read.table()?

Ye         


        
4 answers
  •  萌比男神i
    2020-11-22 04:41

    To read a specific set of columns from a dataset, there are several other options:

    1) With fread from the data.table package:

    The select parameter of fread specifies which columns to read, given as a vector of either column names or column numbers.

    For the example dataset:

    library(data.table)
    dat <- fread("data.txt", select = c("Year","Jan","Feb","Mar","Apr","May","Jun"))
    dat <- fread("data.txt", select = c(1:7))
    

    Alternatively, you can use the drop parameter to indicate which columns should not be read:

    dat <- fread("data.txt", drop = c("Jul","Aug","Sep","Oct","Nov","Dec"))
    dat <- fread("data.txt", drop = c(8:13))
    

    All result in:

    > dat
      Year Jan Feb Mar Apr May Jun
    1 2009 -41 -27 -25 -31 -31 -39
    2 2010 -41 -27 -25 -31 -31 -39
    3 2011 -21 -27  -2  -6 -10 -32
    

    UPDATE: If you don't want fread to return a data.table, set the data.table = FALSE parameter, e.g.: fread("data.txt", select = c(1:7), data.table = FALSE)
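As a quick check that the select, drop and data.table = FALSE variants above agree, here is a minimal self-contained sketch. It assumes the data.table package is installed; the toy file, its values and the variable names (toy, path, by_name, ...) are made up for illustration and are not the question's data.

```r
library(data.table)

# Toy stand-in for the question's file: Year plus 12 month columns.
# The values are invented for illustration only.
toy <- data.frame(Year = 2009:2011,
                  matrix(1:36, nrow = 3, dimnames = list(NULL, month.abb)))
path <- tempfile(fileext = ".txt")
fwrite(toy, path, sep = "\t")

# Keep the first seven columns, by name or by position.
by_name  <- fread(path, select = c("Year", month.abb[1:6]))
by_index <- fread(path, select = 1:7)

# Equivalent: drop the last six months instead.
by_drop <- fread(path, drop = month.abb[7:12])

# Same columns, but returned as a plain data.frame.
as_df <- fread(path, select = 1:7, data.table = FALSE)

names(by_name)
class(as_df)
```

Selecting by name is usually the more robust choice, since it keeps working if the column order in the file changes.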

    2) With read.csv.sql from the sqldf package:

    Another alternative is the read.csv.sql function from the sqldf package, which selects the columns with an SQL query against the file:

    library(sqldf)
    dat <- read.csv.sql("data.txt",
                        sql = "select Year,Jan,Feb,Mar,Apr,May,Jun from file",
                        sep = "\t")
    

    3) With the read_* functions from the readr package:

    library(readr)
    dat <- read_table("data.txt",
                      col_types = cols_only(Year = 'i', Jan = 'i', Feb = 'i', Mar = 'i',
                                            Apr = 'i', May = 'i', Jun = 'i'))
    dat <- read_table("data.txt",
                      col_types = list(Jul = col_skip(), Aug = col_skip(), Sep = col_skip(),
                                       Oct = col_skip(), Nov = col_skip(), Dec = col_skip()))
    dat <- read_table("data.txt", col_types = 'iiiiiii______')
    

    The documentation explains the characters used with col_types:

    each character represents one column: c = character, i = integer, n = number, d = double, l = logical, D = date, T = date time, t = time, ? = guess, or _/- to skip the column
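To make the compact col_types string concrete, here is a minimal sketch, assuming the readr package is installed and the file is tab-separated (as the sep = "\t" in the sqldf example suggests). It uses read_tsv rather than read_table for that reason; the toy file and its values are made up for illustration.

```r
library(readr)

# Toy stand-in for the question's file: Year plus 12 month columns.
# The values are invented for illustration only.
toy <- data.frame(Year = 2009:2011,
                  matrix(1:36, nrow = 3, dimnames = list(NULL, month.abb)))
path <- tempfile(fileext = ".txt")
write.table(toy, path, sep = "\t", row.names = FALSE, quote = FALSE)

# One character per column: seven integers ("i"), then six skipped ("_").
compact <- read_tsv(path, col_types = "iiiiiii______")

# The same selection spelled out by name; unlisted columns are not read.
only <- read_tsv(path,
                 col_types = cols_only(Year = "i", Jan = "i", Feb = "i", Mar = "i",
                                       Apr = "i", May = "i", Jun = "i"))

names(compact)
```

The compact string has to cover every column in the file in order, so cols_only is the safer choice when the number of columns is not known in advance.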
