Question
I have two time series and I am using ccf to find the cross-correlation between them. ccf(ts1, ts2) lists the cross-correlations for all time lags. How can I find the lag that gives the maximum correlation without manually looking at the data?
Answer 1:
Posting the answer from http://r.789695.n4.nabble.com/ccf-function-td2288257.html:
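Find_Max_CCF <- function(a, b)
{
  # Compute the cross-correlation without drawing the plot
  d <- ccf(a, b, plot = FALSE)
  cor <- d$acf[, , 1]
  lag <- d$lag[, , 1]
  res <- data.frame(cor, lag)
  # Keep the row with the largest (signed) correlation
  res_max <- res[which.max(res$cor), ]
  return(res_max)
}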
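As a quick check of the function above, here is a minimal sketch on made-up data. The series z, a and b are hypothetical and not from the original post: b is constructed so that a leads it by 3 steps. According to the ccf documentation, lag k estimates the correlation between a[t+k] and b[t], so the peak is expected at lag -3.

```r
# Function from the answer, repeated so the example is self-contained
Find_Max_CCF <- function(a, b) {
  d <- ccf(a, b, plot = FALSE)
  cor <- d$acf[, , 1]
  lag <- d$lag[, , 1]
  res <- data.frame(cor, lag)
  res[which.max(res$cor), ]
}

# Hypothetical data: white noise, with a[t] equal to b[t + 3]
set.seed(1)
n <- 200
z <- rnorm(n + 3)
a <- z[4:(n + 3)]
b <- z[1:n]

res <- Find_Max_CCF(a, b)
print(res)  # one row: the highest correlation and the lag where it occurs
```

Note that the default lag.max of ccf is fairly small, so a peak outside that window would not be found unless lag.max is raised.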
Answer 2:
I thought I'd redo the above function so that it finds the correlation with the largest absolute value and returns it with its original sign (positive or negative). I also raised the number of lags to (nearly) the maximum.
Find_Abs_Max_CCF <- function(a, b)
{
  # Use (nearly) all available lags
  d <- ccf(a, b, plot = FALSE, lag.max = length(a) - 5)
  cor <- d$acf[, , 1]
  abscor <- abs(d$acf[, , 1])
  lag <- d$lag[, , 1]
  res <- data.frame(cor, lag)
  absres <- data.frame(abscor, lag)
  # Locate the largest absolute correlation, but return the signed value
  absres_max <- res[which.max(absres$abscor), ]
  return(absres_max)
}
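To illustrate why the absolute value matters, here is a small sketch on made-up data (z, a and b are hypothetical). b is a sign-flipped copy of a delayed by 3 steps, so the strongest relationship is a negative correlation, which a plain which.max on the signed values would miss. The function body is a condensed equivalent of the one above.

```r
# Condensed version of the function above, repeated so the example runs alone
Find_Abs_Max_CCF <- function(a, b) {
  d <- ccf(a, b, plot = FALSE, lag.max = length(a) - 5)
  cor <- d$acf[, , 1]
  lag <- d$lag[, , 1]
  i <- which.max(abs(cor))          # position of the largest |correlation|
  data.frame(cor = cor[i], lag = lag[i])
}

# Hypothetical data: a[t] equals -b[t + 3]
set.seed(42)
z <- rnorm(203)
a <- z[4:203]
b <- -z[1:200]

res <- Find_Abs_Max_CCF(a, b)
print(res)  # strongly negative correlation, reported with its sign
```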
Answer 3:
Because 3 is more than 4, I also had a stab at modifying this function, this time by implementing an idea from here:
ccfmax <- function(a, b, e = 0)
{
  d <- ccf(a, b, plot = FALSE, lag.max = length(a) / 2)
  cor <- d$acf[, , 1]
  abscor <- abs(d$acf[, , 1])
  lag <- d$lag[, , 1]
  res <- data.frame(cor, lag)
  absres <- data.frame(abscor, lag)
  maxcor <- max(absres$abscor)
  # Keep every lag whose |correlation| is within a fraction e of the maximum
  absres_max <- res[which(absres$abscor >= maxcor - maxcor * e &
                          absres$abscor <= maxcor + maxcor * e), ]
  return(absres_max)
}
Essentially a relative "error" term e is added, so that if there are several values within a fraction e of the maximum absolute correlation, they all get returned, e.g.:
ayy <- jitter(cos((1:360)/5), 100)
bee <- jitter(sin((1:360)/5), 100)
ccfmax(ayy, bee, 0.02)
           cor lag
348  0.9778319  -8
349  0.9670333  -7
363 -0.9650827   7
364 -0.9763180   8
If no value for e is given, it is taken to be zero, and the function behaves just like the one nvogen posted.
Answer 4:
I've modified the original solution as well, in order to loop over the function and output the values corresponding to a character vector of indices (x):
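abs.max.ccf <- function(x, a, b) {
  d <- ccf(a, b, plot = FALSE, lag.max = length(a) - 5)
  cor <- d$acf[, , 1]
  abscor <- abs(d$acf[, , 1])
  lag <- d$lag[, , 1]
  # Largest absolute correlation and the lag at which it occurs
  abs.cor.max <- abscor[which.max(abscor)]
  abs.cor.max.lag <- lag[which.max(abscor)]
  # x is a character label, so c() returns a character vector
  return(c(x, abs.cor.max, abs.cor.max.lag))
}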
I removed the data.frame part within the function, as it is unnecessarily slow. To loop over each column in a data.frame and return the results to a new data.frame, I use this method:
max.ccf <- lapply(colnames(df), function(x) unlist(abs.max.ccf(x, df$y, df[x])))
max.ccf <- data.frame(do.call(rbind, max.ccf))
colnames(max.ccf) <- c('Index','Cor','Lag')
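One thing to watch with the approach above: because c(x, ...) mixes a character label with numbers, the Cor and Lag columns come out as text. The following is a sketch of one possible variation that keeps them numeric, not the answer author's method. The data frame df, its column names, and the helper abs_max_ccf are all made up for illustration, and this version returns the signed correlation rather than its absolute value.

```r
# Hypothetical helper: returns the signed correlation with the largest
# absolute value, and its lag, as a named numeric vector
abs_max_ccf <- function(a, b) {
  d <- ccf(a, b, plot = FALSE, lag.max = length(a) - 5)
  cor <- d$acf[, , 1]
  lag <- d$lag[, , 1]
  i <- which.max(abs(cor))
  c(Cor = cor[i], Lag = lag[i])
}

# Hypothetical data: the other columns run 2 and 5 steps ahead of y
set.seed(7)
z <- rnorm(105)
df <- data.frame(y = z[1:100], lead2 = z[3:102], lead5 = z[6:105])

# Compare y with every other column; vapply guarantees numeric results
cols <- setdiff(colnames(df), "y")
out <- t(vapply(cols, function(x) abs_max_ccf(df$y, df[[x]]),
                c(Cor = 0, Lag = 0)))
max_ccf <- data.frame(Index = cols, out, row.names = NULL)
print(max_ccf)
```

Using df[[x]] passes a plain numeric vector to ccf rather than a one-column data.frame.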
Source: https://stackoverflow.com/questions/10369109/finding-lag-at-which-cross-correlation-is-maximum-ccf