I have a .csv file, example.csv, with 8000 columns x 40000 rows. The csv file has a string header for each column. All fields contain integer values between 0 and 10. When
If you'll read the file often, it might well be worth saving it from R in a binary format using the save function. Specifying compress=FALSE often results in faster load times.
...You can then load it in with the (surprise!) load function.
# Create a test file: 1000 columns x 1000 rows of integers, with a header row
d <- as.data.frame(matrix(1:1e6,ncol=1000))
write.csv(d, "c:/foo.csv", row.names=FALSE)
# Load file with read.csv
system.time( a <- read.csv("c:/foo.csv") ) # 3.18 sec
# Load file using scan (skips the header line, reads integers into a matrix)
system.time( b <- matrix(scan("c:/foo.csv", 0L, skip=1, sep=','),
                         ncol=1000, byrow=TRUE) ) # 0.55 sec
# Load (binary) file using load
save(d, file="c:/foo.bin", compress=FALSE)
system.time( load("c:/foo.bin") ) # 0.09 sec
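Note that the scan call above skips the header line, so the resulting matrix has no column names. Since the file in the question has a string header for each column, here is a minimal sketch of one way to keep it: read the first line separately as character data and attach it as column names before saving the binary copy. The temp file paths and the tiny 4 x 5 data set are placeholders for illustration only, not the real example.csv.

# Small stand-in for the real file: 5 columns x 4 rows of integers in 0..10.
# (tmp_csv, tmp_bin and the sizes are placeholders, not from the question.)
tmp_csv <- tempfile(fileext = ".csv")
tmp_bin <- tempfile(fileext = ".bin")
d <- as.data.frame(matrix(sample(0:10, 20, replace = TRUE), ncol = 5))
write.csv(d, tmp_csv, row.names = FALSE)

# Read only the header line, as character strings (scan strips the quotes).
hdr <- scan(tmp_csv, what = "", sep = ",", nlines = 1, quiet = TRUE)

# Read the body as integers, skipping the header, and fill the matrix row-wise.
m <- matrix(scan(tmp_csv, what = 0L, sep = ",", skip = 1, quiet = TRUE),
            ncol = length(hdr), byrow = TRUE)
colnames(m) <- hdr

# Cache as an uncompressed binary file; load() restores the object as 'm'.
save(m, file = tmp_bin, compress = FALSE)
rm(m)
load(tmp_bin)

The result is an integer matrix rather than a data.frame, which should be fine when every field is an integer as described in the question. Wrap it in as.data.frame(m) if a data.frame is needed.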