This has been asked several times with no clear answer: I would like to convert an R character string of the form "YYYY-mm-dd" into a Date. The as.Date<
A further speedup: you already work with data.table, so create a lookup table of your dates and join it to your data.
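library(lubridate)
library(data.table)
# Lookup table: one row per calendar day from 1900-01-01 to today, keyed on the string form
y <- seq(as.Date('1900-01-01'), Sys.Date(), by = 'day')
id.date <- data.table(id = as.character(y), date = as.Date(y), key = 'id')
# Test data: one million random date strings drawn from the last 40000 days
set.seed(21)
x <- as.character(Sys.Date()-sample(40000, 1e6, TRUE))
# Baseline: parse every string
system.time(date3 <- as.Date(parse_date_time(x,'%y-%m-%d'))) # from package 'lubridate'
# user system elapsed
# 0.15 0.00 0.15
# Lookup: join the strings against the table and return the date column
system.time(date4 <- id.date[setDT(list(id = x)), on='id', date])
# user system elapsed
# 0.08 0.00 0.08
# Both approaches return the same dates
all(date3 == date4)
# TRUE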
It's kind of a workaround, but I believe that's how data.table is intended to be used. I don't know whether the time/date packages mentioned above are internally based on parsing algorithms or also on lookup tables (hash tables).
For larger datasets, whenever character manipulation is involved, which tends to be slow, I consider switching to a lookup in a reference table.
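The same lookup idea can be sketched in base R without building the table up front: parse each distinct string once and map the results back with match(). This is only a minimal sketch, not something benchmarked here. The helper name fast_as_date is hypothetical, and it assumes the input is a character vector in "YYYY-mm-dd" form with far fewer distinct values than rows.

```r
# Hypothetical helper (not from any package): parse each distinct
# string only once, then map the parsed dates back to every element.
fast_as_date <- function(x) {
  ux <- unique(x)                          # distinct date strings
  ud <- as.Date(ux, format = "%Y-%m-%d")   # parse each one once
  ud[match(x, ux)]                         # look up the date for every element
}

x <- c("2020-01-02", "1999-12-31", "2020-01-02", NA)
dates <- fast_as_date(x)
class(dates)
```

Strings that do not parse, and NA inputs, come back as NA, so this should behave like calling as.Date with the same format on the full vector. Any gain depends on how many duplicates the data contains.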