Rolling window over irregular time series

Posted by 冷眼眸甩不掉的悲伤 on 2019-11-28 05:33:19

Here's a solution using xts:

library(xts)  # also attaches zoo

x <- structure(c(867L, 891L, 901L, 991L, 1085L, 1114L, 1117L, 1151L, 
  1174L, 1268L, 1384L, 1403L, 1550L, 1596L, 1608L), .Dim = c(15L, 1L),
  index = structure(c(1280960887, 1280964672, 1280966285, 
  1280997350, 1281014882, 1281017687, 1281018106, 1281023184, 1281025529, 
  1281050369, 1281096942, 1281108126, 1281176749, 1281207496, 1281215744),
  tzone = "", tclass = c("POSIXct", "POSIXt")), class = c("xts", "zoo"),
  .indexCLASS = c("POSIXct", "POSIXt"), tclass = c("POSIXct", "POSIXt"),
  .indexTZ = "", tzone = "")
# first count the number of observations each day
xd <- apply.daily(x, length)
# now sum the counts over a 2-day rolling window
x2d <- rollapply(xd, 2, sum)
# align times at the end of the period (if you want)
y <- align.time(x2d, n=60*60*24)  # n is in seconds
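
As an aside that is not part of the answer above: if the window should be an exact trailing 48 hours ending at each observation, rather than two calendar days, base R's findInterval can count the observations directly. This is a minimal sketch, assuming the times are sorted in increasing order; the index values are copied from x above, and the names times, width and n.in.window are illustrative.

```r
# Observation times in seconds since the epoch (the index of x above)
times <- c(1280960887, 1280964672, 1280966285, 1280997350, 1281014882,
           1281017687, 1281018106, 1281023184, 1281025529, 1281050369,
           1281096942, 1281108126, 1281176749, 1281207496, 1281215744)

# Window width: 2 days, in seconds
width <- 2 * 24 * 60 * 60

# findInterval(v, times) returns how many elements of times are <= v,
# so the difference counts observations in the window (t - width, t]
n.in.window <- findInterval(times, times) - findInterval(times - width, times)
```

Unlike the daily aggregation, this gives one count per observation, and days without observations need no special handling.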

This seems to work:

# n = number of days
n <- 30
# w = window width. In this example, w = 7 days
w <- 7

# I will simulate some data to illustrate the procedure
data <- rep(1:n, rpois(n, 2))

# Tabulate the number of occurrences per day:
# (use factor() to be sure to have the days with zero observations included)
date.table <- table(factor(data, levels=1:n))  

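# Build an n x n band matrix of ones: the main diagonal plus the
# w-1 diagonals below it, so column c selects days c to c+w-1
mat <- diag(n)
for (i in 2:w){
  dim <- n+i-1
  mat <- mat + diag(dim)[-((n+1):dim),-(1:(i-1))]
}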

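# And the answer is a rolling sum (not a mean): element i is the number of
# observations on days i to i+w-1, truncated at day n
roll.sum.7days <- date.table %*% mat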
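
To see what the product computes, the construction can be checked against an explicit loop on a small input. This is only a sketch with made-up counts (the names counts, by.matrix and by.loop are illustrative). It suggests that each element is the sum of the counts over days i to i+w-1, cut off at day n, so a division by w would still be needed to get a mean.

```r
# Hypothetical toy input: daily counts for n = 6 days, window of w = 3 days
n <- 6
w <- 3
counts <- c(2, 0, 1, 3, 0, 4)

# Same band matrix construction as in the answer
mat <- diag(n)
for (i in 2:w) {
  d <- n + i - 1
  mat <- mat + diag(d)[-((n + 1):d), -(1:(i - 1))]
}

# Rolling result via the matrix product
by.matrix <- as.vector(counts %*% mat)

# Direct computation: sum of days i to i+w-1, truncated at day n
by.loop <- sapply(1:n, function(i) sum(counts[i:min(i + w - 1, n)]))

# TRUE if both approaches agree
isTRUE(all.equal(by.matrix, by.loop))
```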

This does not seem too slow, although the mat matrix has n × n elements. I tried replacing n = 30 with n = 3000 (which creates a matrix of 9 million elements, about 72 MB) and it was still reasonably fast on my computer. For very large data sets, try it on a subset first. It should also be faster to build mat as a sparse matrix with bandSparse from the Matrix package.
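
The bandSparse suggestion could look roughly like the following. This is a sketch, assuming the Matrix package is installed; the simulated counts and the names counts, mat.sparse and roll.sum are illustrative and not from the original answer.

```r
library(Matrix)

n <- 3000   # number of days
w <- 7      # window width in days

set.seed(1)                          # illustrative data only
counts <- as.numeric(rpois(n, 2))    # simulated daily counts

# Ones on the main diagonal (k = 0) and on the w-1 diagonals below it
# (k = -1, ..., -(w-1)), stored sparsely: roughly n * w nonzero entries
# instead of n * n cells
mat.sparse <- bandSparse(n, k = -(0:(w - 1)),
                         diagonals = rep(list(rep(1, n)), w))

# Same product as with the dense matrix; the result is a 1 x n Matrix object
roll.sum <- as.numeric(counts %*% mat.sparse)
```

The band structure is the same as in the dense version, so the result should match it, while memory use grows with n * w rather than n * n.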
