Question
I have two datasets and I need to estimate the correlation between the series in them. For example, one series is of length 189 and the other is of length 192. The end points of the two series correspond to the same time period, i.e., Dec 2015; the difference is in their start points. I need to estimate the correlation over blocks of 12 data points in both series, starting from the last point. For example, the first block would run from Jan 2015 to Dec 2015, and the second block from Jan 2014 to Dec 2014. Since the earliest block would have unequal data lengths, the lengths can be equalized and that block can be shorter than 12 months. In this example, the earliest block would be 9 months long. How do I create a loop to run this? I tried the following. It gives me results, but I get the same correlation value on every loop iteration. I don't know where I am going wrong.
correl = data.frame(x = numeric(0))
r = nrow(US)
s = nrow(Argentina)
a = ifelse(r < s, r, s)
for (i in 1:(a %/% 12)) {
  if (i < a %/% 12) {
    elmnt1 = US[r-11:r, ]$IIP
    elmnt2 = Argentina[s-11:s, ]$IIP
  } else {
    elmnt1 = US[1:r%%12, ]$IIP
    elmnt2 = Argentina[1:s%%12, ]$IIP
  }
  corr = cor(elmnt1, elmnt2)
  correl$x[i, ] = corr
  r = r - 12
  s = s - 12
}
Answer 1:
You don't need a for loop for this. Unequal numbers of observations are a common problem in research, and R's correlation functions have built-in options to handle them. If your two variables sit in the same data frame, with the shorter one padded with NA, here are some options:
# Use cor.test(), which drops any pair containing a missing value (NA):
cor.test(x,y)
#Or add the following argument to the cor() function for the same purpose:
cor(x,y,use='complete.obs')
As long as your x and y are in the same table, and are presumably matched by date in this case, then these options should solve the problem.
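For the block-wise correlations the question asks about, here is a minimal sketch. It is not the original poster's or the answerer's code. It assumes both series are monthly with no gaps and end in the same month, so keeping the last `n` points of each one aligns them; the `US` and `Argentina` data frames below are hypothetical random stand-ins. It also shows one likely source of trouble in the loop: `:` binds tighter than `-` and `%%`, so `r-11:r` is parsed as `r - (11:r)` and `1:r%%12` as `(1:r) %% 12`. Note too that `a %/% 12` is 15 for a length of 189, while 15 full blocks plus the 9-month remainder make 16 blocks.

```r
# Hypothetical stand-ins for the question's tables: monthly IIP series of
# length 189 and 192 that both end in Dec 2015 (values are random noise).
set.seed(1)
US        <- data.frame(IIP = rnorm(189))
Argentina <- data.frame(IIP = rnorm(192))

# Operator precedence: `:` is evaluated before `-`.
r <- nrow(US)
head(r - 11:r)      # 178 177 176 175 174 173  (this is r - (11:r))
head((r - 11):r)    # 178 179 180 181 182 183  (the intended row range)

# Align the series on their common end point by keeping the last n values.
n   <- min(nrow(US), nrow(Argentina))
us  <- tail(US$IIP, n)
arg <- tail(Argentina$IIP, n)

# Block id counted from the end: 1 = last 12 points, 2 = the 12 before that,
# and so on; the earliest block holds whatever is left over (9 points here).
block <- rev((seq_len(n) - 1) %/% 12 + 1)

# One correlation per block, computed on the row indices of that block.
cors <- vapply(split(seq_len(n), block),
               function(idx) cor(us[idx], arg[idx]),
               numeric(1))

correl <- data.frame(block = as.integer(names(cors)),
                     n_obs = as.vector(table(block)),
                     cor   = unname(cors))
correl
```

If the real tables carry a date column, merging on that column would be a safer way to align them than `tail()`, since it does not rely on the no-gaps assumption.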
Source: https://stackoverflow.com/questions/46128303/run-correlation-between-unequal-sized-blocks-of-data-in-2-time-series