R: Why does mean(NA, na.rm = TRUE) return NaN

Submitted by 我怕爱的太早我们不能终老 on 2019-12-07 20:09:54

Question


When computing the mean of a vector that contains only NAs, we get NaN if na.rm = TRUE. Why is this? Is this flawed logic, or is there something I'm missing? Surely it would make more sense to return NA rather than NaN?

A quick example:

mean(NA, na.rm = TRUE)
#[1] NaN

mean(rep(NA, 10), na.rm = TRUE)
#[1] NaN

Answer 1:


It is a bit of a pity that ?mean does not say anything about this. My comment only told you that applying mean to an empty "numeric" vector results in NaN, without giving any further reasoning. Rui Barradas's comment tried to explain this but was not accurate: division by 0 is not always NaN, it can also be Inf or -Inf. I once discussed this in "R: element-wise matrix division". However, we are getting close. Although mean(x) is not implemented as sum(x) / length(x), this mathematical identity does explain the NaN.
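To make the point about division by zero concrete, here is a minimal sketch of the three possible outcomes. The variable names are only illustrative.

```r
# Division by zero in R depends on the numerator.
pos <- 1 / 0     # positive numerator: Inf
neg <- -1 / 0    # negative numerator: -Inf
ind <- 0 / 0     # zero numerator: NaN (indeterminate)

print(c(pos, neg, ind))
```

Only the 0 / 0 case yields NaN, which is why "division by zero" alone does not account for the result.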

From ?sum:

 *NB:* the sum of an empty set is zero, by definition.

So sum(numeric(0)) is 0. Since length(numeric(0)) is also 0, mean(numeric(0)) amounts to 0 / 0, which is NaN.
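The reasoning above can be checked step by step in a short sketch. It only mirrors the arithmetic; as noted, mean is not actually implemented as sum(x) / length(x).

```r
empty <- numeric(0)

total <- sum(empty)     # sum of an empty set is 0
n     <- length(empty)  # no elements, so 0
ratio <- total / n      # 0 / 0
m     <- mean(empty)    # what mean() itself returns

print(c(total = total, n = n, ratio = ratio, mean = m))
```

Both ratio and m come out as NaN, so the sum / length view agrees with what mean returns for an empty numeric vector.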




Answer 2:


From the mean documentation:

na.rm a logical value indicating whether NA values should be stripped before the computation proceeds.

By this logic, all NAs are removed before the mean is computed. In your case, nothing is left once the NAs are removed, so you are taking the mean of an empty vector and NaN is returned.
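As a rough illustration of that stripping step, the sketch below removes the NAs by hand and then takes the mean of what is left. The helper mean_or_na is hypothetical (it is not part of base R) and is only one possible way to get NA back instead of NaN, as the question suggests.

```r
x <- rep(NA, 10)

# Roughly what na.rm = TRUE does: drop the NAs first.
kept <- x[!is.na(x)]
print(length(kept))            # nothing is left
print(mean(kept))              # mean of an empty vector
print(mean(x, na.rm = TRUE))   # same result via na.rm

# Hypothetical helper: return NA when every value is missing.
# Note that is.na() is also TRUE for NaN, and all() is TRUE for an
# empty vector, so those inputs return NA as well.
mean_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else mean(x, na.rm = TRUE)
}

print(mean_or_na(x))
print(mean_or_na(c(1, NA, 3)))
```

Whether NA or NaN is the more useful answer for an all-missing input depends on the analysis; the helper just makes that choice explicit.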



Source: https://stackoverflow.com/questions/51503869/r-why-does-meanna-na-rm-true-return-nan
