问题
I often end up in situations where it is necessary to check if the obtained difference is above machine precision. Seems like for this purpose R has a handy variable: .Machine$double.eps. However when I turn to R source code for guidelines about using this value I see multiple different patterns.
Examples
Here are a few examples from stats library:
t.test.R
if(stderr < 10 *.Machine$double.eps * abs(mx))
chisq.test.R
if(abs(sum(p)-1) > sqrt(.Machine$double.eps))
integrate.R
rel.tol < max(50*.Machine$double.eps, 0.5e-28)
lm.influence.R
e[abs(e) < 100 * .Machine$double.eps * median(abs(e))] <- 0
princomp.R
if (any(ev[neg] < - 9 * .Machine$double.eps * ev[1L]))
etc.
Questions
- How can one understand the reasoning behind all those different 10 *,100 *,50 *andsqrt()modifiers?
- Are there guidelines about using .Machine$double.epsfor adjusting differences due to precision issues?
回答1:
The machine precision for double depends on its current value. .Machine$double.eps gives the precision when the values is 1. You can use the C function nextAfter to get the machine precision for other values.
library(Rcpp)
cppFunction("double getPrec(double x) {
  return nextafter(x, std::numeric_limits<double>::infinity()) - x;}")
(pr <- getPrec(1))
#[1] 2.220446e-16
1 + pr == 1
#[1] FALSE
1 + pr/2 == 1
#[1] TRUE
1 + (pr/2 + getPrec(pr/2)) == 1
#[1] FALSE
1 + pr/2 + pr/2 == 1
#[1] TRUE
pr/2 + pr/2 + 1 == 1
#[1] FALSE
Adding value a to value b will not change b when a is <= half of it's machine precision. Checking if the difference is smaler than machine precision is done with <. The modifiers might consider typical cases how often an addition did not show a change.
In R the machine precision can be estimated with:
getPrecR <- function(x) {
  y <- log2(pmax(.Machine$double.xmin, abs(x)))
  ifelse(x < 0 & floor(y) == y, 2^(y-1), 2^floor(y)) * .Machine$double.eps
}
getPrecR(1)
#[1] 2.220446e-16
Each double value is representing a range. For a simple addition, the range of the result depends on the reange of each summand and also the range of their sum.
library(Rcpp)
cppFunction("std::vector<double> getRange(double x) {return std::vector<double>{
   (nextafter(x, -std::numeric_limits<double>::infinity()) - x)/2.
 , (nextafter(x, std::numeric_limits<double>::infinity()) - x)/2.};}")
x <- 2^54 - 2
getRange(x)
#[1] -1  1
y <- 4.1
getRange(y)
#[1] -4.440892e-16  4.440892e-16
z <- x + y
getRange(z)
#[1] -2  2
z - x - y #Should be 0
#[1] 1.9
2^54 - 2.9 + 4.1 - (2^54 + 5.9) #Should be -4.7
#[1] 0
2^54 - 2.9 == 2^54 - 2      #Gain 0.9
2^54 - 2 + 4.1 == 2^54 + 4  #Gain 1.9
2^54 + 5.9 == 2^54 + 4      #Gain 1.9
For higher precission Rmpfr could be used.
library(Rmpfr)
mpfr("2", 1024L)^54 - 2.9 + 4.1 - (mpfr("2", 1024L)^54 + 5.9)
#[1] -4.700000000000000621724893790087662637233734130859375
In case it could be converted to integer gmp could be used (what is in Rmpfr).
library(gmp)
as.bigz("2")^54 * 10 - 29 + 41 - (as.bigz("2")^54 * 10 + 59)
#[1] -47
回答2:
Definition of a machine.eps: it is the lowest value eps for which 1+eps is not 1  
As a rule of thumb (assuming a floating point representation with base 2):
This eps makes the difference for the range 1 .. 2,
for the range 2 .. 4 the precision is 2*eps
and so on. 
Unfortunately, there is no good rule of thumb here. It's entirely determined by the needs of your program.
In R we have all.equal as a built in way to test approximate equality. So you could use maybe something like (x<y) | all.equal(x,y)
i <- 0.1
 i <- i + 0.05
 i
if(isTRUE(all.equal(i, .15))) { #code was getting sloppy &went to multiple lines
    cat("i equals 0.15\n") 
} else {
    cat("i does not equal 0.15\n")
}
#i equals 0.15
Google mock has a number of floating point matchers for double precision comparisons, including DoubleEq and DoubleNear. You can use them in an array matcher like this:
ASSERT_THAT(vec, ElementsAre(DoubleEq(0.1), DoubleEq(0.2)));
Update:
Numerical Recipes provide a derivation to demonstrate that using a one-sided difference quotient, sqrt is a good choice of step-size for finite difference approximations of derivatives.
The Wikipedia article site Numerical Recipes, 3rd edition, Section 5.7, which is pages 229-230 (a limited number of page views is available at http://www.nrbook.com/empanel/).
all.equal(target, current,
           tolerance = .Machine$double.eps ^ 0.5, scale = NULL,
           ..., check.attributes = TRUE)
These IEEE floating point arithmetic is a well known limitation of computer arithmetic and is discussed in several places:
- The FAQ in R has a whole question dedicated to it: R FAQ 7.31
- The R Inferno by Patrick Burns dedicated the first "Circle" to this problem (page 9 onward)
- Arithmetic Sum Proof Problem asked in Math Meta
- David Goldberg, "What Every Computer Scientist Should Know About Floating-point Arithmetic," ACM Computing Surveys 23, 1 (1991-03), 5-48 doi>10.1145/103162.103163 (revision also available)
- The Floating-Point Guide - What Every Programmer Should Know About Floating-Point Arithmetic
- 0.30000000000000004.com compares floating point arithmetic across programming languages
 
- Canonical duplicate for "floating point is inaccurate" (a meta discussion about a canonical answer for this issue)
- Several Stack Overflow questions including
- Why can't decimal numbers be represented exactly in binary?
- Why are floating point numbers inaccurate?
- Is floating point math broken?
 
- Math Tricks Explained by Arthur T. Benjamin
.
   dplyr::near() is another option for testing if two vectors of floating point numbers are equal. 
The function has a built in tolerance parameter: tol = .Machine$double.eps^0.5 that can be adjusted. The default parameter is the same as the default for all.equal().
来源:https://stackoverflow.com/questions/59229545/what-is-the-correct-standard-way-to-check-if-difference-is-smaller-than-machine