Can anyone please tell me how to read only the first 6 months (7 columns) for each year of the data below, for example by using read.table()
?
Ye
To read a specific set of columns from a dataset you, there are several other options:
1) With fread
from the data.table
-package:
You can specify the desired columns with the select
parameter from fread
from the data.table
package. You can specify the columns with a vector of column names or column numbers.
For the example dataset:
library(data.table)
dat <- fread("data.txt", select = c("Year","Jan","Feb","Mar","Apr","May","Jun"))
dat <- fread("data.txt", select = c(1:7))
Alternatively, you can use the drop
parameter to indicate which columns should not be read:
dat <- fread("data.txt", drop = c("Jul","Aug","Sep","Oct","Nov","Dec"))
dat <- fread("data.txt", drop = c(8:13))
All result in:
> data
Year Jan Feb Mar Apr May Jun
1 2009 -41 -27 -25 -31 -31 -39
2 2010 -41 -27 -25 -31 -31 -39
3 2011 -21 -27 -2 -6 -10 -32
UPDATE: When you don't want fread
to return a data.table, use the data.table = FALSE
-parameter, e.g.: fread("data.txt", select = c(1:7), data.table = FALSE)
2) With read.csv.sql
from the sqldf
-package:
Another alternative is the read.csv.sql
function from the sqldf
package:
library(sqldf)
dat <- read.csv.sql("data.txt",
sql = "select Year,Jan,Feb,Mar,Apr,May,Jun from file",
sep = "\t")
3) With the read_*
-functions from the readr
-package:
library(readr)
dat <- read_table("data.txt",
col_types = cols_only(Year = 'i', Jan = 'i', Feb = 'i', Mar = 'i',
Apr = 'i', May = 'i', Jun = 'i'))
dat <- read_table("data.txt",
col_types = list(Jul = col_skip(), Aug = col_skip(), Sep = col_skip(),
Oct = col_skip(), Nov = col_skip(), Dec = col_skip()))
dat <- read_table("data.txt", col_types = 'iiiiiii______')
From the documentation an explanation for the used characters with col_types
:
each character represents one column: c = character, i = integer, n = number, d = double, l = logical, D = date, T = date time, t = time, ? = guess, or _/- to skip the column