Correlation/p values of all combinations of all rows of two matrices

旧时难觅i 2020-12-20 04:20

I would like to calculate the correlation, and the p-value of that correlation, of each species (bac) with each of the factors (fac) in a second data frame. Both were measured at

4 Answers
  • 2020-12-20 04:46

    We can use expand.grid to get every combination of the row names of 'bac' and 'fac', loop over those combinations with apply (MARGIN = 1), subset the matching rows of 'bac' and 'fac' by name, run corr.test on each pair, and collect the 'p' values in a list.

    library(psych)
    # x[1] is a row name of bac, x[2] a row name of fac
    do.call(c, apply(expand.grid(rownames(bac), rownames(fac)), 1, 
      function(x) list(corr.test(cbind(unlist(bac[x[1],]), unlist(fac[x[2],])))$p)))
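
    The same expand.grid/apply pattern can be sketched in base R alone, keeping the row names next to each p-value. The toy 'bac' and 'fac' tables below are hypothetical stand-ins (the asker's data is not shown here), and cor.test is swapped in for psych::corr.test so that each pair yields a single number:

```r
# Hypothetical toy data (not the asker's): rows are variables, columns are samples.
set.seed(1)
bac <- data.frame(matrix(rnorm(12), nrow = 3,
                         dimnames = list(paste0("bac", 1:3), paste0("s", 1:4))))
fac <- data.frame(matrix(rnorm(8), nrow = 2,
                         dimnames = list(paste0("fac", 1:2), paste0("s", 1:4))))

# Every (bac row, fac row) name combination, kept as character strings.
combos <- expand.grid(bac = rownames(bac), fac = rownames(fac),
                      stringsAsFactors = FALSE)

# One Pearson test per combination: x[1] is the bac row name, x[2] the fac row name.
combos$p <- apply(combos, 1, function(x)
  cor.test(unlist(bac[x[1], ]), unlist(fac[x[2], ]))$p.value)

combos
```

    Each row of 'combos' then holds one bac/fac pair together with its unadjusted p-value.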
    
  • 2020-12-20 04:54

    You can just loop over the rows of expand.grid

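    # Index every (bac row, fac row) combination, plus two empty columns for the results
    pairs <- as.matrix(expand.grid(seq_len(nrow(bac)), seq_len(nrow(fac))))
    pairs <- cbind(pairs, NA, NA)
    b <- as.matrix(bac)
    f <- as.matrix(fac)
    for (i in seq_len(nrow(pairs))) {
        ct <- cor.test(b[pairs[i, 1], ], f[pairs[i, 2], ])
        pairs[i, 3] <- ct$estimate   # Pearson correlation
        pairs[i, 4] <- ct$p.value    # two-sided p-value
    }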
    colnames(pairs) <- c('bac','fac','corr','p')
    pairs
    ##      bac fac        corr          p
    ## [1,]   1   1  0.98994949 0.01005051
    ## [2,]   2   1 -0.07559289 0.92440711
    ## [3,]   3   1 -0.60000000 0.40000000
    ## [4,]   4   1 -0.60000000 0.40000000
    ## [5,]   5   1 -0.07559289 0.92440711
    ## [6,]   1   2  0.98994949 0.01005051
    

    If you want the names you can then do

    pairs <- as.data.frame(pairs)
    pairs[,1] <- rownames(bac)[pairs[,1]]
    pairs[,2] <- rownames(fac)[pairs[,2]]
    

    although at that point it's probably easier to use 李哲源 Zheyuan Li's solution.

  • 2020-12-20 05:07

    If you transpose your data so that each species and each factor becomes a column, the problem reduces to computing the correlation between every pair of columns, which is easy with outer.

    tbac <- data.frame(t(bac))
    tfac <- data.frame(t(fac))
    
    f <- function (x, y) cor(x, y)
    
    tab <- outer(tfac, tbac, Vectorize(f))
    
    as.data.frame.table(tab)
    

    I have another answer that uses the same idea: "Match data and count number of same value".
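
    The same outer/Vectorize pattern can also return the p-values the question asks for. This is a minimal sketch on hypothetical toy data; 'p_fun' and 'p_tab' are illustrative names, and cor.test is assumed as the test:

```r
# Hypothetical toy data (not the asker's): rows are variables, columns are samples.
set.seed(1)
bac <- matrix(rnorm(12), nrow = 3, dimnames = list(paste0("bac", 1:3), NULL))
fac <- matrix(rnorm(8), nrow = 2, dimnames = list(paste0("fac", 1:2), NULL))

# Transpose so that each variable becomes a column of a data frame.
tbac <- data.frame(t(bac))
tfac <- data.frame(t(fac))

# p-value of the Pearson correlation test for one pair of columns.
p_fun <- function(x, y) cor.test(x, y)$p.value

# outer() pairs every column of tfac with every column of tbac.
p_tab <- outer(tfac, tbac, Vectorize(p_fun))

# Long format: one row per fac/bac pair.
as.data.frame.table(p_tab, responseName = "p")
```

    The p-values here are unadjusted; p.adjust can be applied to the 'p' column if a multiple-testing correction is wanted.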

  • 2020-12-20 05:08

    You can just pass the full matrices to the cor function (or psych::corr.test), and it takes care of correlating every column of the first with every column of the second. Since the variables here are stored in rows, transpose both inputs first.

    For example

    cor(t(fac), t(bac))
    #            bac1        bac2        bac3        bac4        bac5
    # fac1  0.9899495 -0.07559289 -0.60000000 -0.60000000 -0.07559289
    # fac2  0.9899495 -0.07559289 -0.60000000 -0.60000000 -0.07559289
    # fac3 -0.3207135  0.94285714 -0.07559289 -0.07559289 -0.48571429
    # fac4 -0.8000000 -0.32071349  0.98994949  0.98994949 -0.32071349
    # fac5 -0.3207135 -0.48571429 -0.07559289 -0.07559289  0.94285714
    # fac6         NA          NA          NA          NA          NA
    

    You can then turn this into long format using reshape2::melt

    reshape2::melt(cor(t(fac), t(bac)))
    #    Var1 Var2       value
    # 1  fac1 bac1  0.98994949
    # 2  fac2 bac1  0.98994949
    # 3  fac3 bac1 -0.32071349
    # 4  fac4 bac1 -0.80000000
    # ---
    # ---
    

    To get the p-values, use the same approach with psych::corr.test

    test <- psych::corr.test(t(fac), t(bac), adjust="none")
    

    Then melt both matrices as before and merge them

    merge(reshape2::melt(test$r, value.name="cor"), reshape2::melt(test$p, value.name="p-value"), by=c("Var1", "Var2"))
    #   Var1 Var2         cor    p-value
    # 1 fac1 bac1  0.98994949 0.01005051
    # 2 fac1 bac2 -0.07559289 0.92440711
    # 3 fac1 bac3 -0.60000000 0.40000000
    # 4 fac1 bac4 -0.60000000 0.40000000
    # 5 fac1 bac5 -0.07559289 0.92440711
    # 6 fac2 bac1  0.98994949 0.01005051
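
    If installing psych and reshape2 is not an option, the same long table can be sketched in base R. The p-values below come from the standard t statistic for Pearson's r with n - 2 degrees of freedom, which is the test that cor.test applies; the toy data is hypothetical, and the sketch assumes there are no missing values, so that every pair shares the same n:

```r
# Hypothetical toy data (not the asker's): rows are variables, columns are samples.
set.seed(1)
bac <- matrix(rnorm(20), nrow = 4, dimnames = list(paste0("bac", 1:4), NULL))
fac <- matrix(rnorm(15), nrow = 3, dimnames = list(paste0("fac", 1:3), NULL))

r <- cor(t(fac), t(bac))                  # fac x bac matrix of Pearson correlations
n <- ncol(bac)                            # number of samples shared by both tables
t_stat <- r * sqrt((n - 2) / (1 - r^2))   # t statistic for each correlation
p <- 2 * pt(-abs(t_stat), df = n - 2)     # two-sided, unadjusted p-values

# Long format: matrices are unrolled column by column, matching row(r) and col(r).
res <- data.frame(fac = rownames(r)[row(r)],
                  bac = colnames(r)[col(r)],
                  cor = as.vector(r),
                  p   = as.vector(p))
res
```

    Because everything is computed on whole matrices, this avoids one cor.test call per pair, which matters when both tables have many rows.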
    