Question
I calculated a cross-correlation of two time series using ccf() in R. I know how to derive the confidence limits as:
ccf1 <- ccf(x = x, y = y, lag.max = 5, na.action = na.pass, plot = FALSE)
upperCI <- qnorm((1+0.95)/2)/sqrt(ccf1$n.used)
lowerCI <- -qnorm((1+0.95)/2)/sqrt(ccf1$n.used)
But what I really need is the p-value of the maximum correlation.
ind.max <- which.max(abs(ccf1$acf))  # index of the largest absolute correlation over all 11 lags
max.cor <- ccf1$acf[ind.max]
lag.opt <- ccf1$lag[ind.max]
How do I calculate this p-value? I have searched high and low but can't find a good answer anywhere.
Answer 1:
Getting the p-value is straightforward.
Under the null hypothesis that the true correlation is 0, the sample correlation is approximately normally distributed for large samples:
Z ~ N(mean = 0, sd = 1/sqrt(ccf1$n.used))
So for your observed maximum correlation max.cor, the two-sided p-value is the probability Pr(|Z| > |max.cor|), which can be computed by:
2 * (1 - pnorm(abs(max.cor), mean = 0, sd = 1/sqrt(ccf1$n.used)))
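To make the steps concrete, here is a minimal end-to-end sketch on simulated data. The series x and y, the seed, and the 2-step delay are illustrative assumptions and are not taken from the question; only ccf(), pnorm() and the formula above come from the thread.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
n <- 200

# Illustrative data (an assumption): y is x delayed by 2 steps, plus noise
x <- rnorm(n)
y <- c(rnorm(2), x[1:(n - 2)]) + rnorm(n, sd = 0.5)

ccf1 <- ccf(x = x, y = y, lag.max = 5, na.action = na.pass, plot = FALSE)

# Largest absolute cross-correlation and the lag at which it occurs
ind.max <- which.max(abs(ccf1$acf))
max.cor <- ccf1$acf[ind.max]
lag.opt <- ccf1$lag[ind.max]

# Large-sample standard deviation under the null hypothesis of zero correlation
se <- 1 / sqrt(ccf1$n.used)

# Two-sided p-value, using the same formula as above
p.value <- 2 * (1 - pnorm(abs(max.cor), mean = 0, sd = se))

print(c(lag = lag.opt, cor = max.cor, p = p.value))
```

For a very strong correlation, 1 - pnorm(...) can underflow to exactly 0; writing the same quantity as 2 * pnorm(-abs(max.cor), sd = se) avoids that.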
Follow-up
Is it really that simple? The ccf is computing many correlations at once!
Are you saying that ccf is computing correlations at different lags? Well, provided you have a large number of observations N, the standard deviation of the sample correlation at each lag is approximately the same: 1/sqrt(N). That is why the confidence limits are two horizontal lines.
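The claim that every lag shares roughly the same standard deviation can be checked with a small simulation. The sketch below is only an illustration under assumed settings (independent Gaussian white-noise series, N = 200, 2000 replications, all chosen arbitrarily); it compares the empirical standard deviation at each lag with 1/sqrt(N).

```r
set.seed(42)     # arbitrary seed for reproducibility
N <- 200         # length of each simulated series (assumed)
n.rep <- 2000    # number of simulated pairs (assumed)
lag.max <- 5

# Each column holds the 2 * lag.max + 1 cross-correlations of one pair
# of independent white-noise series, so the null hypothesis is true
sim <- replicate(n.rep, {
  x <- rnorm(N)
  y <- rnorm(N)
  as.vector(ccf(x, y, lag.max = lag.max, plot = FALSE)$acf)
})

# Empirical standard deviation per lag, relative to the theoretical 1/sqrt(N)
emp.sd <- apply(sim, 1, sd)
ratio <- emp.sd * sqrt(N)

print(round(ratio, 2))  # values near 1 support the 1/sqrt(N) approximation
```

If the ratios are all close to 1, the single pair of horizontal confidence limits is a reasonable summary for every lag in this setting.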
Source: https://stackoverflow.com/questions/38173544/how-to-calculate-p-values-from-cross-correlation-function-in-r