问题
I want to apply a function to 20 trading days worth of hourly FX data (as one example amongst many).
I started off with rollapply(data,width=20*24,FUN=FUN,by=24)
. That seemed to be working well, I could even assert I always got 480 bars passed in... until I realized that wasn't what I wanted. The start and end time of those 480 bars was drifting over the years, due to changes in daylight savings, and market holidays.
So, what I want is a function that treats a day as from 22:00 to 22:00 of each day we have data for. (21:00 to 21:00 in N.Y. summertime - my data timezone is UTC, and daystart is defined at 5pm ET)
So, I made my own rollapply function with this at its core:
ep=endpoints(data,on=on,k=k)
sp=ep[1:(length(ep)-width)]+1
ep=ep[(width+1):length(ep)]
xx <- lapply(1:length(ep), function(ix) FUN(.subset_xts(data,sp[ix]:ep[ix]),...) )
I then called this with on="days", k=1 and width=20.
This has two problems:
- Days is in days, not trading days! So, instead of typically 4 weeks of data, I get just under 3 weeks of data.
- The cutoff is midnight UTC. I cannot work out how to change it to use the 22:00 (or 21:00) cutoff.
UPDATE: Problem 1 above is wrong! The XTS
endpoints
function does work in trading days, not calendar days. The reason I thought otherwise is the timezone issue made it look like a 6-day trading week: Sun to Fri. Once the timezone problem was fixed (see my self-answer), usingwidth=20
andon="days"
does indeed give me 4 weeks of data.
(The typically there is important: when there is a trading holiday during those 4 weeks I expect to receive 4 weeks 1 day's worth of data, i.e. always exactly 20 trading days.)
I started working on a function to cut the data into weeks, thinking I could then cut them into five 24hr chunks, but this feels like the wrong approach, and surely someone has invented this wheel before me?
回答1:
Here is how to get the daybreak right:
x2=x
index(x2)=index(x2)+(7*3600)
indexTZ(x2)='America/New_York'
I.e. just setting the timezone puts the daybreak at 17:00; we want it to be at 24:00, so add 7 hours on first.
With help from: time zones in POSIXct and xts, converting from GMT in R
Here is the full function:
rollapply_chunks.FX.xts=function(data,width,FUN,...,on="days",k=1){
data <- try.xts(data)
x2 <- data
index(x2) <- index(x2)+(7*3600)
indexTZ(x2) <- 'America/New_York'
ep <- endpoints(x2,on=on,k=k) #The end point of each calendar day (when on="days").
#Each entry points to the final bar of the day. ep[1]==0.
if(length(ep)<2){
stop("Cannot divide data up")
}else if(length(ep)==2){ #Can only fit one chunk in.
sp <- 1;ep <- ep[-1]
}else{
sp <- ep[1:(length(ep)-width)]+1
ep <- ep[(width+1):length(ep)]
}
xx <- lapply(1:length(ep), function(ix) FUN(.subset_xts(data,sp[ix]:ep[ix]),...) )
xx <- do.call(rbind,xx) #Join them up as one big matrix/data.frame.
tt <- index(data)[ep] #Implicit align="right". Use sp for align="left"
res <- xts(xx, tt)
return (res)
}
You can see we use the modified index to split up the original data. (If R uses copy-on-write under the covers, then the only extra memory requirement should be for a copy of the index, not of the data.)
(Legal bit: please consider it licensed under MIT, but explicit permission given to use in the GPL-2 XTS package if that is desired.)
来源:https://stackoverflow.com/questions/12087092/xts-split-fx-intraday-bar-data-by-trading-days