Is there a way for fread to mimic the behaviour of read.table whereby the class of the variable is set by the data that is read in?
Option 1: Using a system command
fread() allows the use of a system command in its first argument. We can use it to remove the quotes in the first column of the file.
indt <- data.table::fread("cat test.csv | tr -d '\"'", nrows = 100)
str(indt)
# Classes ‘data.table’ and 'data.frame': 100 obs. of 2 variables:
# $ x: int 1 2 3 4 5 6 7 8 9 10 ...
# $ y: int 1 2 3 4 5 6 7 8 9 10 ...
# - attr(*, ".internal.selfref")=<externalptr>
The system command cat test.csv | tr -d '\"' explained:
cat test.csv reads the file to standard output
| is a pipe, using the output of the previous command as input for the next command
tr -d '\"' deletes (-d) all occurrences of double quotes ('\"') from the current input
Option 2: Coercion after reading
Since option 1 doesn't seem to be working on your system, another possibility is to read the file as you did, but convert the x column with type.convert().
library(data.table)
indt2 <- fread("test.csv", nrows = 100)[, x := type.convert(x, as.is = TRUE)]
str(indt2)
# Classes ‘data.table’ and 'data.frame': 100 obs. of 2 variables:
# $ x: int 1 2 3 4 5 6 7 8 9 10 ...
# $ y: int 1 2 3 4 5 6 7 8 9 10 ...
# - attr(*, ".internal.selfref")=
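If more than one column might come back as character, the same coercion can be applied to all of them at once. The following is only a sketch: the temporary file and its contents are made up for illustration (they stand in for test.csv), and the length check is there because some fread versions may already read the quoted column as integer.

```r
library(data.table)

# Hypothetical sample file standing in for test.csv: the first column is quoted.
tmp <- tempfile(fileext = ".csv")
writeLines(c("x,y", '"1",1', '"2",2', '"3",3'), tmp)

dt <- fread(tmp)

# Find the columns that were read as character (possibly none).
chr_cols <- names(dt)[vapply(dt, is.character, logical(1))]

# Convert those columns by reference; as.is = TRUE keeps genuine text as character
# instead of turning it into a factor.
if (length(chr_cols) > 0) {
  dt[, (chr_cols) := lapply(.SD, type.convert, as.is = TRUE), .SDcols = chr_cols]
}

str(dt)
```

Because := modifies the data.table in place, no reassignment is needed after the conversion.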
Side note: I usually prefer to use type.convert() over as.numeric() to avoid the "NAs introduced by coercion" warning triggered in some cases. For example,
x <- c("1", "4", "NA", "6")
as.numeric(x)
# [1] 1 4 NA 6
# Warning message:
# NAs introduced by coercion
type.convert(x, as.is = TRUE)
# [1] 1 4 NA 6
But of course you can use as.numeric() as well.
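One more thing type.convert() offers is the na.strings argument, which may help if the file marks missing values with something other than "NA". A small sketch, where the "N/A" marker is just an assumed example:

```r
# Hypothetical input in which missing values are written as "N/A".
x2 <- c("1", "4", "N/A", "6")

# Treat both "NA" and "N/A" as missing while guessing the type.
y2 <- type.convert(x2, na.strings = c("NA", "N/A"), as.is = TRUE)

class(y2)
```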
Note: This answer assumes data.table dev v1.9.5