Calculate correlation by aggregating columns of data frame

Posted by 邮差的信 on 2019-12-01 10:45:12

You could use apply:

> apply(y[,-1],1,function(x) cor(x[1:2],x[3:4]))
[1] -1 -1  1 -1  1
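
For a version that can be run as-is, here is a minimal sketch. The question's data is not reproduced on this page, so the data frame y below is hypothetical: a group column followed by numeric columns a, b, c and d, with values chosen only so that the signs match the output above.

```r
# Hypothetical data (not the asker's): one label column, four numeric columns
y <- data.frame(
  group = c("a", "b", "c", "d", "e"),
  a = c(1, 4, 2, 9, 3),
  b = c(2, 1, 5, 4, 7),
  c = c(5, 2, 3, 1, 4),
  d = c(3, 6, 8, 7, 9)
)

# Drop the label column, then correlate (a, b) with (c, d) within each row
row_cor <- apply(y[, -1], 1, function(x) cor(x[1:2], x[3:4]))
print(row_cor)
```

Dropping the first column matters: with the character column left in, apply would coerce every row to character and cor would fail.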

Or ddply from the plyr package (although this might be overkill, and if two rows share the same group it pools them: the correlation is computed between the combined a and b values and the combined c and d values of both rows):

> ddply(y,.(group),function(x) cor(c(x$a,x$b),c(x$c,x$d)))
  group V1
1     a -1
2     b -1
3     c  1
4     d -1
5     e  1
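
The pooling caveat can be seen without plyr. The sketch below swaps ddply for base R's split and sapply, which group the rows in the same way; the data frame y2 and its values are made up for illustration.

```r
# Hypothetical data: group "a" appears twice, group "b" once
y2 <- data.frame(
  group = c("a", "a", "b"),
  a = c(1, 2, 4),
  b = c(3, 5, 1),
  c = c(2, 1, 2),
  d = c(4, 3, 6)
)

# Base-R stand-in for the ddply call: one correlation per group
by_group <- sapply(
  split(y2, y2$group),
  function(g) cor(c(g$a, g$b), c(g$c, g$d))
)
print(by_group)
```

Group "a" is computed from four pooled observations, so its result is no longer forced to +1 or -1, while the single-row group "b" still is.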

You can use apply to apply a function to each row (or column) of a matrix, array or data.frame.

apply(
  y[,-1], # Remove the first column, to ensure that u remains numeric
  1,      # Apply the function on each row
  function(u) cor( u[1:2], u[3:4] )
)

(With just 2 observations, the correlation can only be +1 or -1.)
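
To see why, note that two points always lie exactly on a line, so the correlation is just the sign of the product of the two differences. A small check on random, purely illustrative numbers:

```r
set.seed(1)                            # arbitrary seed, for repeatability
pairs <- matrix(runif(40), ncol = 4)   # 10 hypothetical rows of 4 values

# Row-wise correlation of columns 1:2 against columns 3:4
r <- apply(pairs, 1, function(u) cor(u[1:2], u[3:4]))

# Sign of (difference in first pair) * (difference in second pair)
expected <- sign((pairs[, 2] - pairs[, 1]) * (pairs[, 4] - pairs[, 3]))

print(all.equal(r, expected))
```

If the two values in either pair are equal, the standard deviation is zero and cor returns NA with a warning instead.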

You're almost there: you just need to use apply instead of sapply, and remove unnecessary columns.

apply(y[-1], 1, function(x) cor(x[1:2], x[3:4]))

Of course, the correlation between two length-2 vectors isn't very informative.
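
The asker's sapply attempt is not shown on this page, but the reason apply fits better can be sketched on a hypothetical data frame: sapply treats a data frame as a list of columns, whereas apply with margin 1 walks over rows.

```r
# Hypothetical data: 3 rows, 1 label column, 4 numeric columns
y3 <- data.frame(
  group = c("a", "b", "c"),
  a = c(1, 4, 2),
  b = c(2, 1, 5),
  c = c(5, 2, 3),
  d = c(3, 6, 8)
)

# sapply visits each column: four results, each of length 3
by_col <- sapply(y3[-1], length)

# apply with margin 1 visits each row: three results, each of length 4
by_row <- apply(y3[-1], 1, length)

print(by_col)
print(by_row)
```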
