Why does 1..99,999 == “1”..“99,999” in R, but 100,000 != “100,000”?

Submitted by 可紊 on 2019-12-29 01:34:55

Question


In the R console, go ahead and try:

> sum(sapply(1:99999, function(x) { x != as.character(x) }))
[1] 0

For every value from 1 through 99999, the comparisons "1" == 1, "2" == 2, ..., 99999 == "99999" are all TRUE. However,

> 100000 == "100000"
[1] FALSE

Why does R have this quirky behavior, and is it a bug? What would be a workaround to, e.g., check whether every element of an atomic character vector is in fact numeric? So far I have been checking whether x == as.numeric(x) for each x, but that fails on certain datasets because of the problem above!


Answer 1:


Have a look at as.character(100000). Its value is not "100000", and R is essentially just telling you so.

as.character(100000)
# [1] "1e+05"

Here, from ?Comparison, are R's rules for applying relational operators to values of different types:

If the two arguments are atomic vectors of different types, one is coerced to the type of the other, the (decreasing) order of precedence being character, complex, numeric, integer, logical and raw.

Those rules mean that when you test whether 1=="1", say, R first converts the numeric value on the LHS to a character string, and then tests whether the character strings on the LHS and RHS are equal. In some cases they will be equal, and in others they will not. Which cases produce inequality depends on the current settings of options("scipen") and options("digits").
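As a small illustration of that coercion rule, here is a sketch that assumes R's default options (in particular scipen = 0). The integer case is an addition not discussed above: integers are never written in scientific notation when converted to character, which is worth knowing because 1:99999 produces integers.

```r
# Numeric vs. character: the number is converted to a string, then strings are compared.
1 == "1"              # TRUE:  the number 1 becomes the string "1"
1 == "1.0"            # FALSE: "1" and "1.0" are different strings

# A double literal of 100000 becomes "1e+05" under default options ...
100000 == "100000"    # FALSE
100000 == "1e+05"     # TRUE

# ... but an integer is converted without scientific notation.
100000L == "100000"   # TRUE
```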

So, when you type 100000=="100000", it is as if you were actually performing the following test. (Note that internally, R probably uses something other than as.character() to perform the conversion.)

as.character(100000)=="100000"
# [1] FALSE
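As for the workaround asked about in the question, one common approach is to convert in the other direction: parse each string with as.numeric() and see whether the result is NA. The following is only a sketch; the helper name is_numeric_string is hypothetical, and it counts as "numeric" anything as.numeric() can parse (including "1e+05" and strings with surrounding whitespace), which may or may not match what you want.

```r
# Hypothetical helper (not part of base R): TRUE where a string parses as a number.
is_numeric_string <- function(x) {
  stopifnot(is.character(x))
  # as.numeric() returns NA (with a warning) for strings it cannot parse,
  # so suppress the warning and test for NA instead of comparing strings.
  !is.na(suppressWarnings(as.numeric(x)))
}

x <- c("1", "99999", "100000", "1e+05", "3.14", "abc", "")
is_numeric_string(x)
# [1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE
all(is_numeric_string(x))
# [1] FALSE
```

Because the strings are never compared with a formatted number, this check does not depend on how R chooses to print 100000.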


Source: https://stackoverflow.com/questions/18964562/why-does-1-99-999-1-99-999-in-r-but-100-000-100-000
