I would like to calculate the correlation and the p value of that correlatio of each species (bac) to each of the factors (fac) in a second data frame. Both were measured at
We can use expand.grid
to get the combinations of rownames
of 'bac' and 'fac', loop through the rows with apply
specifying the MARGIN
as 1, subset the rows of 'bac' and 'fac' based on the rownames, do the corr.test
and extract the 'p' values as a list
library(psych)
do.call(c, apply(expand.grid(rownames(bac), rownames(fac)), 1,
function(x) list(corr.test(cbind(unlist(bac[1,]), unlist(fac[1,])))$p)))
You can just loop over the rows of expand.grid
pairs <- as.matrix(expand.grid(1:nrow(bac),1:nrow(fac)))
pairs <- cbind(pairs,NA,NA)
b <- as.matrix(bac)
f <- as.matrix(fac)
for(i in 1:nrow(pairs)){
pairs[i,3] <- cor(b[pairs[i,1],], f[pairs[i,2],])
pairs[i,4] <- cor.test(b[pairs[i,1],], f[pairs[i,2],])$p.value
}
colnames(pairs) <- c('bac','fac','corr','p')
pairs
## bac fac corr p
## [1,] 1 1 0.98994949 0.01005051
## [2,] 2 1 -0.07559289 0.92440711
## [3,] 3 1 -0.60000000 0.40000000
## [4,] 4 1 -0.60000000 0.40000000
## [5,] 5 1 -0.07559289 0.92440711
## [6,] 1 2 0.98994949 0.01005051
If you want the names you can then do
pairs <- as.data.frame(pairs)
pairs[,1] <- sapply(pairs[,1],function(x) rownames(bac)[x])
pairs[,2] <- sapply(pairs[,2],function(x) rownames(fac)[x])
although at that point it's probably easier to use 李哲源 Zheyuan Li 's solution.
If you restructure your data, such that you compute correlation between paired columns, it would be super easy.
tbac <- data.frame(t(bac))
tfac <- data.frame(t(fac))
f <- function (x, y) cor(x, y)
tab <- outer(tfac, tbac, Vectorize(f))
as.data.frame.table(tab)
I had an answer using the same idea: Match data and count number of same value.
You can just pass the full matrices to the cor
function (or psych::corr.test
)and it takes care of finding the correlation of the relevant columns.
For example
cor(t(fac), t(bac))
# bac1 bac2 bac3 bac4 bac5
# fac1 0.9899495 -0.07559289 -0.60000000 -0.60000000 -0.07559289
# fac2 0.9899495 -0.07559289 -0.60000000 -0.60000000 -0.07559289
# fac3 -0.3207135 0.94285714 -0.07559289 -0.07559289 -0.48571429
# fac4 -0.8000000 -0.32071349 0.98994949 0.98994949 -0.32071349
# fac5 -0.3207135 -0.48571429 -0.07559289 -0.07559289 0.94285714
# fac6 NA NA NA NA NA
You can then turn this in to long format using reshape2::melt
reshape2::melt(cor(t(fac), t(bac)))
# Var1 Var2 value
# 1 fac1 bac1 0.98994949
# 2 fac2 bac1 0.98994949
# 3 fac3 bac1 -0.32071349
# 4 fac4 bac1 -0.80000000
# ---
# ---
To get the p-values use the same approach
test <- psych::corr.test(t(fac), t(bac), adjust="none")
And melt as before and join
merge(melt(test$r, value.name="cor"), melt(test$p, value.name="p-value"), by=c("Var1", "Var2"))
# Var1 Var2 cor p-value
# 1 fac1 bac1 0.98994949 0.01005051
# 2 fac1 bac2 -0.07559289 0.92440711
# 3 fac1 bac3 -0.60000000 0.40000000
# 4 fac1 bac4 -0.60000000 0.40000000
# 5 fac1 bac5 -0.07559289 0.92440711
# 6 fac2 bac1 0.98994949 0.01005051