Question
I have two matrices of the same size. I would like to calculate the correlation coefficient between each pair of corresponding rows in these matrices: row 1 of A with row 1 of B, row 2 of A with row 2 of B, and so on.
A <- matrix(runif(200), nrow=20)
B <- matrix(runif(200), nrow=20)
The best I could come up with is
ret <- sapply(1:20, function(i) cor(A[i,], B[i,]))
but it is terribly inefficient (the real matrices have tens of thousands of rows). Is there a better, faster way?
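Answer 1:
This should be fast, because it uses only vectorized operations on whole matrices instead of one cor() call per row: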
cA <- A - rowMeans(A)
cB <- B - rowMeans(B)
sA <- sqrt(rowMeans(cA^2))
sB <- sqrt(rowMeans(cB^2))
rowMeans(cA * cB) / (sA * sB)
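Note that the code above divides by n (via rowMeans) rather than n - 1, whereas cor() is based on the n - 1 sample estimates. The factor appears once in the numerator and once in the denominator, so it cancels and the result is the same Pearson correlation. A quick sanity check, as a sketch: the seed is arbitrary, and the names ref and fast are introduced here only for illustration.

```r
set.seed(1)  # arbitrary seed, only so the check is reproducible
A <- matrix(runif(200), nrow = 20)
B <- matrix(runif(200), nrow = 20)

# Reference: one cor() call per pair of rows, as in the question
ref <- sapply(seq_len(nrow(A)), function(i) cor(A[i, ], B[i, ]))

# Vectorized version: center each row, then combine row-wise means
cA <- A - rowMeans(A)
cB <- B - rowMeans(B)
fast <- rowMeans(cA * cB) / (sqrt(rowMeans(cA^2)) * sqrt(rowMeans(cB^2)))

# Compare the two results up to floating-point tolerance
all.equal(ref, fast)
```

The subtraction A - rowMeans(A) relies on R storing matrices column by column: the vector of row means is recycled down each column, so every element has the mean of its own row subtracted.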
Answer 2:
You could write vectorized functions that compute the row-wise standard deviation and covariance for you, such as:
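# Sample standard deviation of each row (n - 1 denominator, like sd())
RowSD <- function(x) {
  sqrt(rowSums((x - rowMeans(x))^2) / (ncol(x) - 1))
}
# Sample covariance between corresponding rows of x and y (like cov())
VecCov <- function(x, y) {
  rowSums((x - rowMeans(x)) * (y - rowMeans(y))) / (ncol(x) - 1)
}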
Then, simply do
VecCov(A, B)/(RowSD(A) * RowSD(B))
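To get a feel for the difference at the scale mentioned in the question, the same computation can be wrapped in a single function and timed against the sapply loop. This is only a rough sketch: row_cor is a hypothetical helper name (not part of base R), the 50000-row size is an assumption standing in for "tens of thousands of rows", and timings will vary by machine.

```r
# row_cor is a hypothetical helper, not a base R function
row_cor <- function(x, y) {
  # Both inputs must be matrices of identical dimensions
  stopifnot(is.matrix(x), is.matrix(y), identical(dim(x), dim(y)))
  cx <- x - rowMeans(x)  # center each row of x
  cy <- y - rowMeans(y)  # center each row of y
  # The (n - 1) factors cancel, so plain sums are enough
  rowSums(cx * cy) / sqrt(rowSums(cx^2) * rowSums(cy^2))
}

set.seed(42)     # arbitrary seed
n_rows <- 50000  # assumed size for illustration
A_big <- matrix(runif(n_rows * 10), nrow = n_rows)
B_big <- matrix(runif(n_rows * 10), nrow = n_rows)

# Time the row-by-row loop and the vectorized helper
system.time(loop_res <- sapply(seq_len(n_rows),
                               function(i) cor(A_big[i, ], B_big[i, ])))
system.time(vec_res <- row_cor(A_big, B_big))

# Both approaches should agree up to floating-point tolerance
all.equal(loop_res, vec_res)
```

One behavioral difference to keep in mind: for a row with zero variance, cor() returns NA with a warning, while the vectorized formula evaluates 0/0 and returns NaN.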
Source: https://stackoverflow.com/questions/27943070/row-wise-correlations-in-r