I started using the data.table package in R to boost the performance of my code. I am using the following code:
sp500 <- read.csv('../rawdata/GMTSP.csv')
days &
Thanks for the suggestions. I solved it by writing the Gaussian algorithm for the dates myself and got far better results, see below.
getWeekDay <- function(year, month, day) {
  # Gauss's algorithm for the day of the week: 0 = Sunday, ..., 6 = Saturday
  Y <- year
  Y[month<3] <- (Y[month<3] - 1)   # January and February count as part of the previous year
  d <- day
  m <- ((month + 9)%%12) + 1       # shifted month: March = 1, ..., February = 12
  c <- floor(Y/100)                # century
  y <- Y-c*100                     # year within the century
  dayofweek <- (d + floor(2.6*m - 0.2) + y + floor(y/4) + floor(c/4) - 2*c) %% 7
  return(dayofweek)
}
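A quick way to gain confidence in a hand-written weekday formula is to compare it with base R over a range of dates. The sketch below is only an illustration: `gauss_wday` is a hypothetical compact copy of the function above (it uses `(13 * m - 1) %/% 5`, the integer form of `floor(2.6 * m - 0.2)`), and the date range is an arbitrary choice.

```r
# Compact copy of the weekday formula so this block runs on its own.
# (13 * m - 1) %/% 5 is the integer form of floor(2.6 * m - 0.2).
gauss_wday <- function(year, month, day) {
  Y <- ifelse(month < 3, year - 1, year)   # Jan/Feb belong to the previous year
  m <- ((month + 9) %% 12) + 1             # March = 1, ..., February = 12
  c <- Y %/% 100                           # century
  y <- Y %% 100                            # year within the century
  (day + (13 * m - 1) %/% 5 + y + y %/% 4 + c %/% 4 - 2 * c) %% 7
}

# Compare with base R on every day from 1990-01-01 to 2020-12-31.
# as.POSIXlt()$wday also uses 0 = Sunday, ..., 6 = Saturday.
dates <- seq(as.Date("1990-01-01"), as.Date("2020-12-31"), by = "day")
lt    <- as.POSIXlt(dates)
ours  <- gauss_wday(lt$year + 1900, lt$mon + 1, lt$mday)
all(ours == lt$wday)
```

If the last line returns TRUE, the formula agrees with base R on every date in that range.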
library(data.table)
sp500 <- read.csv('../rawdata/GMTSP.csv')
days <- c("Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday")
# Using data.table to get the things much much faster
sp500 <- data.table(sp500, key="Date")
sp500 <- sp500[,Month:=as.integer(substr(Date,1,2))]
sp500 <- sp500[,Day:=as.integer(substr(Date,4,5))]
sp500 <- sp500[,Year:=as.integer(substr(Date,7,10))]
#sp500 <- sp500[,Date:=as.Date(Date, "%m/%d/%Y")]
#sp500 <- sp500[,Weekday:=factor(weekdays(sp500[,Date]), levels=days, ordered=T)]
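# Map codes 0..6 to names explicitly, so a weekday that never occurs in the data cannot shift the labels
sp500 <- sp500[,Weekday:=factor(getWeekDay(Year, Month, Day), levels=0:6, labels=days)]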
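On the labelling step: a minimal base-R sketch, using made-up weekday codes rather than the real file, of how factor() treats codes that never occur. It assumes data with no weekend rows, which may or may not apply to GMTSP.csv.

```r
days <- c("Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday")
codes <- c(1, 2, 3, 4, 5, 1)   # hypothetical data: Monday..Friday only

# Relabelling after the fact: the levels are only the codes present (1..5),
# so the names are assigned by position, starting from "Sunday".
f1 <- factor(codes)
levels(f1) <- days
as.character(f1[1])   # "Sunday", although code 1 means Monday

# Fixing the code-to-name mapping up front keeps the labels aligned.
f2 <- factor(codes, levels = 0:6, labels = days)
as.character(f2[1])   # "Monday"
```

When every code from 0 to 6 is present, both forms give the same result; the explicit `levels`/`labels` form simply does not depend on that.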
Running the whole block above, including reading the data from the CSV, gives the timing below. data.table is truly impressive.
   user  system elapsed
 19.074   0.803  20.284
The conversion itself takes 3.49 seconds elapsed.
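For anyone who wants to reproduce this kind of measurement, figures in that format come from system.time(). The sketch below is only an illustration on synthetic data: the `dates` vector is a made-up stand-in for the Date column, and the numbers it prints will differ by machine and say nothing about the original file.

```r
# Hypothetical stand-in for the Date column: 200,000 strings in mm/dd/yyyy form.
set.seed(1)
dates <- format(as.Date("2000-01-01") + sample(0:5000, 2e5, replace = TRUE),
                "%m/%d/%Y")

# Approach 1: parse to Date, then ask base R for the weekday name.
t_parse <- system.time(w1 <- weekdays(as.Date(dates, format = "%m/%d/%Y")))

# Approach 2: slice the string into integer columns, as in the code above.
t_slice <- system.time({
  month <- as.integer(substr(dates, 1, 2))
  day   <- as.integer(substr(dates, 4, 5))
  year  <- as.integer(substr(dates, 7, 10))
})

print(t_parse)   # user / system / elapsed for the Date-based route
print(t_slice)   # user / system / elapsed for the substring route
```

Wrapping only the conversion step in system.time(), rather than the whole script, is how a separate figure for the conversion itself can be obtained.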