Correlation between two dataframes by row

Posted by 瘦欲@ on 2021-02-04 10:22:13

Question


I have two data frames with 5 columns and 100 rows each.

id       price1      price2     price3     price4     price5
 1         11.22      25.33      66.47      53.76      77.42
 2         33.56      33.77      44.77      34.55      57.42
...

I would like to get the correlation of the corresponding rows, basically

for (i in 1:100) {
  cor(df1[i, 1:5], df2[i, 1:5])
}

but without using a for-loop. I'm assuming there's some way to use plyr to do it, but I can't seem to get it right. Any suggestions?


Answer 1:


Depending on whether you want a cool or a fast solution, you can use either

diag(cor(t(df1), t(df2)))

which is cool but wasteful (it computes the correlation between every row of df1 and every row of df2, a full 100 x 100 matrix, and then discards everything except the diagonal) or

A <- as.matrix(df1)
B <- as.matrix(df2)
sapply(seq_len(nrow(A)), function(i) cor(A[i, ], B[i, ]))

which does only what you want but is a bit more to type.
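
To check that the two approaches agree, here is a minimal sketch on made-up data. The random values, the seed, and the names r_diag and r_rows are assumptions for illustration only and are not from the original question. vapply is used in place of sapply only to state the expected result type explicitly.

```r
# Hypothetical example data: two 100 x 5 data frames of random "prices".
set.seed(42)
df1 <- as.data.frame(matrix(runif(500, 10, 80), nrow = 100))
df2 <- as.data.frame(matrix(runif(500, 10, 80), nrow = 100))

# Approach 1: build the full 100 x 100 correlation matrix, keep the diagonal.
r_diag <- diag(cor(t(df1), t(df2)))

# Approach 2: compute one correlation per pair of corresponding rows.
A <- as.matrix(df1)
B <- as.matrix(df2)
r_rows <- vapply(seq_len(nrow(A)), function(i) cor(A[i, ], B[i, ]), numeric(1))

# Both should give the same 100 values, up to floating-point error.
stopifnot(isTRUE(all.equal(unname(r_diag), r_rows)))
```

The diag() version does roughly 100 times more work here, which hardly matters at this size but grows quadratically with the number of rows.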




Answer 2:


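I found that as.matrix is not required, as long as each row slice is flattened to a vector first. df1[i,] on its own is still a one-row data frame, and cor treats its columns as separate variables with a single observation each, so unlist is used here.

Correlations of corresponding rows of dataframes df1 and df2:

sapply(1:nrow(df1), function(i) cor(unlist(df1[i,]), unlist(df2[i,])))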

and columns:

sapply(1:ncol(df1), function(i) cor(df1[,i], df2[,i]))
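
The difference between row and column slices of a data frame is easy to overlook, so here is a small sketch on made-up data. The data, the seed, and the names by_row and by_col are assumptions for illustration. It flattens each row with unlist before calling cor, and cross-checks the result against the diag() approach.

```r
# Hypothetical example data, small enough to inspect by hand.
set.seed(1)
df1 <- as.data.frame(matrix(rnorm(20), nrow = 4))
df2 <- as.data.frame(matrix(rnorm(20), nrow = 4))

# A single row of a data frame is still a data frame;
# a single column is a plain numeric vector.
stopifnot(is.data.frame(df1[1, ]), is.numeric(df1[, 1]))

# Row-wise: flatten each one-row slice to a numeric vector first.
by_row <- sapply(seq_len(nrow(df1)),
                 function(i) cor(unlist(df1[i, ]), unlist(df2[i, ])))

# Column-wise: the columns are already vectors.
by_col <- sapply(seq_len(ncol(df1)), function(i) cor(df1[, i], df2[, i]))

# Cross-check the row-wise result against the diag() approach.
stopifnot(isTRUE(all.equal(by_row, unname(diag(cor(t(df1), t(df2)))))))
```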


Source: https://stackoverflow.com/questions/9136116/correlation-between-two-dataframes-by-row
