Rolling list over unequal times in XTS

Submitted by Deadly on 2019-12-08 02:51:58

Question


I have stock data at the tick level and would like to create a rolling list of all ticks for the previous 10 seconds. The code below works, but takes a very long time for large amounts of data. I'd like to vectorize this process or otherwise make it faster, but I'm not coming up with anything. Any suggestions or nudges in the right direction would be appreciated.

library(quantmod)
set.seed(150)

# Create five minutes of xts example data at .1 second intervals
mins  <- 5
ticks <- mins * 60 * 10 + 1


times <- xts(runif(seq_len(ticks),1,100), order.by=seq(as.POSIXct("1973-03-17 09:00:00"),
                                                       as.POSIXct("1973-03-17 09:05:00"), length = ticks))

# Randomly remove some ticks to create unequal intervals
times <- times[runif(seq_along(times))>.3]

# Number of seconds to look back
lookback  <- 10
dist.list <- vector("list", nrow(times))  # preallocate one empty slot per tick

system.time(
  for (i in 1:length(times)) {

    dist.list[[i]] <- times[paste(strptime(index(times[i])-(lookback-1), format = "%Y-%m-%d %H:%M:%S"), "/",
                                  strptime(index(times[i])-1, format = "%Y-%m-%d %H:%M:%S"), sep = "")]
  }
)
#    user  system elapsed
#    6.12    0.00    5.85

Answer 1:


Check out the window function; it makes subsetting by a date range a lot easier. The following code uses lapply to do the work of the for loop.

# Your code
system.time(
  for (i in 1:length(times)) {

    dist.list[[i]] <- times[paste(strptime(index(times[i])-(lookback-1), format = "%Y-%m-%d %H:%M:%S"), "/",
                                  strptime(index(times[i])-1, format = "%Y-%m-%d %H:%M:%S"), sep = "")]
  }
  )

#    user  system elapsed 
#    10.09    0.00   10.11

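# My code
# Note: start = x - lookback - 1 is x minus 11 seconds and end = x is the tick
# itself, so this window is wider than the one in the question's loop.
system.time(dist.list <- lapply(index(times),
    function(x) window(times, start = x - lookback - 1, end = x))
)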
#    user  system elapsed 
#    3.02    0.00    3.03 

So it runs in about a third of the time.
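
As an alternative to a date-based lookup for every tick, the window boundaries can be found by position, since an xts index is always sorted. The following is a minimal sketch in base R and is not part of the original answer. It uses plain vectors instead of xts (tick_times, values, first, last and windows are illustrative names) and assumes the window for a tick at time t is [t - lookback, t), which differs slightly from both snippets above.

```r
# Sketch under stated assumptions: sorted, unique tick times; window = [t - lookback, t)
set.seed(150)
tick_times <- as.POSIXct("1973-03-17 09:00:00", tz = "UTC") + sort(runif(2000, 0, 300))
values     <- runif(length(tick_times), 1, 100)
lookback   <- 10  # seconds

t_num <- as.numeric(tick_times)

# With left.open = TRUE, findInterval() counts the ticks strictly earlier than x.
# first[i]: position of the first tick at or after t[i] - lookback
first <- findInterval(t_num - lookback, t_num, left.open = TRUE) + 1L
# last[i]: position of the last tick strictly before t[i]
last  <- findInterval(t_num, t_num, left.open = TRUE)

# Slice by integer position; an empty window yields a zero-length vector
windows <- lapply(seq_along(t_num), function(i) {
  if (first[i] > last[i]) values[0] else values[first[i]:last[i]]
})
```

With the xts object from the question, the same positions could be computed from as.numeric(index(times)) and applied as times[first[i]:last[i]].
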

But if you really want to speed things up, and you are willing to forgo sub-second accuracy (which I think your original method implicitly does), you could run the loop only over the unique timestamps truncated to the second, because all ticks within the same second return the same time window. This should speed things up roughly twenty to thirty times compared with the original loop:

# Cheesy method to drop the fractional seconds: it relies on as.character() omitting them.
dat.time <- unique(as.POSIXct(as.character(index(times))))
system.time(dist.list.2 <- lapply(dat.time, function(x) window(times, start = x - lookback - 1, end = x)))

# user  system elapsed 
# 0.37    0.00    0.39 
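
Note that dist.list.2 holds one element per distinct second rather than one per tick. To recover the window that belongs to a given tick, the tick's truncated timestamp can be matched against the unique seconds. Below is a small sketch in base R with made-up timestamps; tick_times, sec_of_tick and window_id are illustrative names, and trunc() is used here as one way to drop fractional seconds without a round trip through character strings.

```r
# Six made-up ticks falling in three distinct seconds
tick_times <- as.POSIXct("1973-03-17 09:00:00", tz = "UTC") +
  c(0.1, 0.4, 0.9, 1.2, 3.5, 3.7)

# Truncate each tick to the whole second it falls in
sec_of_tick <- as.POSIXct(trunc(tick_times, "secs"))

# One entry per distinct second, in order of first appearance
dat.time <- unique(sec_of_tick)

# window_id[i] is the position in dat.time (and so in a list built with
# lapply(dat.time, ...)) that corresponds to tick i
window_id <- match(as.numeric(sec_of_tick), as.numeric(dat.time))
window_id
# 1 1 1 2 3 3
```

The window for tick i would then be dist.list.2[[window_id[i]]], assuming dist.list.2 was built from the same dat.time.
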


Source: https://stackoverflow.com/questions/10722512/rolling-list-over-unequal-times-in-xts
