I have several .txt files with the same structure. I want to read them into R using fread and then combine them into one larger dataset.
## First
I've rewritten the code for this far too many times, so I finally rolled it into a handy function, below.
library(data.table)  # provides fread() and rbindlist()

data.table_fread_mult <- function(filepaths = NULL, dir = NULL, recursive = FALSE, extension = NULL, ...){
  # fread() multiple filepaths and then combine the results into a single data.table
  # This function has two interfaces: either
  # 1) provide `filepaths` as a character vector of filepaths to read or
  # 2) provide `dir` (and optionally `extension` and `recursive`) to identify the directory to read from
  # ... should be arguments to pass on to fread()
  # Scalar conditions, so use the short-circuit operators && and ||
  if(!is.null(filepaths) && (!is.null(dir) || !is.null(extension))){
    stop("If `filepaths` is given, `dir` and `extension` should be NULL")
  } else if(is.null(filepaths) && is.null(dir)){
    stop("If `filepaths` is not given, `dir` should be given")
  }
  # If filepaths isn't given, build it from dir, recursive, extension
  if(is.null(filepaths)){
    filepaths <- list.files(
      path = dir,
      full.names = TRUE,
      recursive = recursive,
      # `pattern` is a regex anchored at the end of the file name;
      # when `extension` is NULL this is just "$", which matches every file
      pattern = paste0(extension, "$")
    )
  }
  # Read and combine files
  return(rbindlist(lapply(filepaths, fread, ...), use.names = TRUE))
}
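One detail worth knowing about the `pattern = paste0(extension, "$")` step: `list.files()` treats `pattern` as a regular expression, so an unescaped `.` in the extension matches any character. The sketch below illustrates this on a scratch directory. The directory name and file names are made up for the example, and escaping the dot is a suggestion, not part of the function above.

```r
library(data.table)

# Hypothetical scratch directory and file names, for illustration only
tmp <- file.path(tempdir(), "fread_mult_demo")
dir.create(tmp, showWarnings = FALSE)
fwrite(data.table(ID = 1:2, value = c(10, 20)), file.path(tmp, "a.txt"))
fwrite(data.table(ID = 3:4, value = c(30, 40)), file.path(tmp, "b.txt"))
fwrite(data.table(ID = 5L, value = 50), file.path(tmp, "notes_txt"))

# Unescaped "." matches any character, so "notes_txt" is picked up too
loose <- list.files(tmp, pattern = paste0(".txt", "$"))

# Escaping the dot restricts the match to a literal ".txt" suffix
strict <- list.files(tmp, pattern = paste0("\\.txt", "$"))

# Read only the strictly matched files and stack them
paths <- file.path(tmp, strict)
combined <- rbindlist(lapply(paths, fread), use.names = TRUE)
```

If stray matches are a concern, passing `extension = "\\.txt"` to the function should have the same effect as the `strict` pattern here.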
Use rbindlist(), which is designed to rbind a list of data.tables together:
mylist <- lapply(all.files, readdata)
mydata <- rbindlist( mylist )
And as @Roland says, do not set the key in each iteration of your function!
So in summary, this is best:
l <- lapply(all.files, fread, sep=",")
dt <- rbindlist( l )
setkey( dt , ID, date )
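To try the summary pattern end to end, here is a small self-contained sketch. The sample files, their contents, and the extra column `x` are invented for the example; only the `ID` and `date` columns and the `all.files` name come from the snippet above.

```r
library(data.table)

# Hypothetical sample files written to a scratch directory
demo_dir <- file.path(tempdir(), "rbindlist_demo")
dir.create(demo_dir, showWarnings = FALSE)
fwrite(
  data.table(ID = c(2L, 1L), date = c("2020-01-02", "2020-01-01"), x = c(0.5, 1.5)),
  file.path(demo_dir, "part1.txt")
)
fwrite(
  data.table(ID = c(1L, 3L), date = c("2020-01-03", "2020-01-01"), x = c(2.5, 3.5)),
  file.path(demo_dir, "part2.txt")
)

all.files <- list.files(demo_dir, full.names = TRUE)

# Read every file, stack the results, and key the combined table once
l <- lapply(all.files, fread, sep = ",")
dt <- rbindlist(l)
setkey(dt, ID, date)
```

Keying once at the end means the combined table is sorted a single time, whereas a key set on each piece inside the loop would not carry over to the result of rbindlist() anyway.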