cor shows only NA or 1 for correlations - Why?

独自空忆成欢 提交于 2019-11-28 06:46:48

The 1s are because everything is perfectly correlated with itself, and the NAs are because there are NAs in your variables.

You will have to specify how you want R to compute the correlation when there are missing values, because the default is to only compute a coefficient with complete information.

You can change this behavior with the use argument to cor, see ?cor for details.

Tell the correlation to ignore the NAs with use argument, e.g.:

cor(data$price, data$exprice, use = "complete.obs")

NAs also appear if there are attributes with zero variance (with all elements equal); see for instance:

cor(cbind(a=runif(10),b=rep(1,10)))

which returns:

   a  b
a  1 NA
b NA  1
Warning message:
In cor(cbind(a = runif(10), b = rep(1, 10))) :
  the standard deviation is zero
Patrick Koua

very simple and correct answer

Tell the correlation to ignore the NAs with use argument, e.g.:

cor(data$price, data$exprice, use = "complete.obs")

The NA can actually be due to 2 reasons. One is that there is a NA in your data. Another one is due to there being one of the values being constant. This results in standard deviation being equal to zero and hence the cor function returns NA.

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!