Extracting and formatting results of cor.test on multiple pairs of columns

Anonymous (unverified), submitted on 2019-12-03 09:14:57

Question:

I am trying to generate a table of correlation results. Specifically, I use a for loop to compute the correlation between each of columns 4:40 and a single reference column. The resulting table is usable, but it does not identify which columns are being compared. Checking the components of the cor.test result, I find that data.name is reported as "x[[1]] and y[[i]]", which is not enough to trace back which column each row refers to. Here is my code:

    input <- read.delim(file="InputData.txt", header=TRUE)
    x <- input[,41, drop=FALSE]
    y <- input[,4:40]
    corr.values <- vector("list", 37)
    for (i in 1:length(y)) {
      corr.values[[i]] <- cor.test(x[[1]], y[[i]], method="pearson")
    }
    lres <- sapply(corr.values, `[`, c("statistic","p.value","estimate","method", "data.name"))
    lres <- t(lres)
    write.table(lres, file="output.xls", sep="\t", row.names=TRUE)

The output file looks like this:

        statistic      p.value      estimate      method                                data.name
    1   -2.030111981   0.042938137  -0.095687495  Pearson's product-moment correlation  x[[1]] and y[[i]]
    2   -2.795786248   0.005400938  -0.131239287  Pearson's product-moment correlation  x[[1]] and y[[i]]
    3   -2.099114632   0.036368337  -0.098908573  Pearson's product-moment correlation  x[[1]] and y[[i]]
    4   -1.920649487   0.055413178  -0.090571599  Pearson's product-moment correlation  x[[1]] and y[[i]]
    5   -1.981326962   0.048168291  -0.093408365  Pearson's product-moment correlation  x[[1]] and y[[i]]
    6   -2.80390736    0.00526909   -0.131613912  Pearson's product-moment correlation  x[[1]] and y[[i]]
    7   -1.265138839   0.206482153  -0.059798855  Pearson's product-moment correlation  x[[1]] and y[[i]]
    8   -2.861448156   0.004415411  -0.134266636  Pearson's product-moment correlation  x[[1]] and y[[i]]
    9   -2.103403363   0.035990039  -0.099108672  Pearson's product-moment correlation  x[[1]] and y[[i]]
    10  -3.610094985   0.000340807  -0.168498786  Pearson's product-moment correlation  x[[1]] and y[[i]]

Clearly, this is not ideal: the rows are only numbered, so I can't tell which column each correlation refers to. Is there a way to fix this? I tried several solutions, but none worked. I suspect the trick is to edit the data.name component, but I couldn't figure out how to do that.

Answer 1:

Here's a way to return a data frame with all the cor.test results that also includes the names of the variables for which each correlation was calculated. We create a function that extracts the relevant results of cor.test, then use mapply to apply it to each pair of variables for which we want the correlation. With SIMPLIFY=FALSE, mapply returns a list of one-row data frames, so we use do.call(rbind, ...) to combine them into a single data frame.

    # Function to extract the correlation coefficient, p-value, and test statistic
    corrFunc <- function(var1, var2, data) {
      result = cor.test(data[,var1], data[,var2])
      data.frame(var1, var2, result[c("estimate","p.value","statistic","method")],
                 stringsAsFactors=FALSE)
    }

    ## Pairs of variables for which we want correlations.
    ## stringsAsFactors=FALSE keeps the names as character strings; as factors
    ## (the default before R 4.0) they would index columns by integer code
    ## and pair up the wrong columns.
    vars = data.frame(v1=names(mtcars)[1], v2=names(mtcars)[-1],
                      stringsAsFactors=FALSE)

    # Apply corrFunc to all rows of vars
    corrs = do.call(rbind, mapply(corrFunc, vars[,1], vars[,2], MoreArgs=list(data=mtcars),
                                  SIMPLIFY=FALSE, USE.NAMES=FALSE))
    corrs

         var1 var2   estimate      p.value statistic                               method
    cor   mpg  cyl -0.8521620 6.112687e-10 -8.919699 Pearson's product-moment correlation
    cor1  mpg disp -0.8475514 9.380327e-10 -8.747152 Pearson's product-moment correlation
    cor2  mpg   hp -0.7761684 1.787835e-07 -6.742389 Pearson's product-moment correlation
    cor3  mpg drat  0.6811719 1.776240e-05  5.096042 Pearson's product-moment correlation
    cor4  mpg   wt -0.8676594 1.293959e-10 -9.559044 Pearson's product-moment correlation
    cor5  mpg qsec  0.4186840 1.708199e-02  2.525213 Pearson's product-moment correlation
    cor6  mpg   vs  0.6640389 3.415937e-05  4.864385 Pearson's product-moment correlation
    cor7  mpg   am  0.5998324 2.850207e-04  4.106127 Pearson's product-moment correlation
    cor8  mpg gear  0.4802848 5.400948e-03  2.999191 Pearson's product-moment correlation
    cor9  mpg carb -0.5509251 1.084446e-03 -3.615750 Pearson's product-moment correlation
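The same idea can be adapted to the layout in the question (column 41 compared against columns 4:40). The following is only a sketch under assumptions: InputData.txt is not available here, so a synthetic data frame with hypothetical column names col1 to col41 stands in for it, and the output column names compared_to and variable, as well as the temporary output path out_file, are illustrative choices rather than anything from the original post.

```r
# Synthetic stand-in for InputData.txt (assumption: 41 numeric columns)
set.seed(1)
input <- as.data.frame(matrix(rnorm(50 * 41), nrow = 50))
names(input) <- paste0("col", 1:41)   # hypothetical column names

x <- input[[41]]     # reference column, as in the question's code
y <- input[, 4:40]   # columns to compare against it

# Loop over column names instead of positions, so each result
# carries the name of the column it was computed from
res <- lapply(names(y), function(nm) {
  ct <- cor.test(x, y[[nm]], method = "pearson")
  data.frame(compared_to = names(input)[41],
             variable    = nm,
             statistic   = unname(ct$statistic),
             p.value     = ct$p.value,
             estimate    = unname(ct$estimate),
             method      = ct$method,
             stringsAsFactors = FALSE)
})
lres <- do.call(rbind, res)   # stack the one-row data frames

# Write a tab-separated file; a temporary path is used here for illustration
out_file <- file.path(tempdir(), "output.tsv")
write.table(lres, file = out_file, sep = "\t", row.names = FALSE, quote = FALSE)
```

Because the variable names are stored as ordinary columns, the written table identifies each comparison without relying on row numbers or on the data.name component.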

