Is IEEE 754-2008 deterministic?

拥有回忆 submitted on 2021-02-19 01:31:06

Question


If I start with the same values, and perform the same primitive operations (addition, multiplication, comparison etc.) on double-precision 64-bit IEEE 754-2008 values, will I get the same result, independent of the underlying machine?

More concretely: Since ECMAScript 2015 specifies that a Number value is

primitive value corresponding to a double-precision 64-bit binary format IEEE 754-2008 value

can I conclude that the same operations yield the same result here, independent of the environment?


Answer 1:


(There are a lot of footnotes here to head off the well-actually crowd, but they don't affect your questions about ECMAScript.)

IEEE 754

If I start with the same values, and perform the same primitive operations (addition, multiplication, comparison etc.) on double-precision 64-bit IEEE 754-2008 values, will I get the same result, independent of the underlying machine?

Yes.

The IEEE 754-2008 (and IEEE 754-2019) standard precisely defines the addition, subtraction, multiplication, division, and square root operations on all floating-point values, except for distinctions between different NaN values.[1] Implementations of the standard[2] agree on all inputs. The same goes for three-way comparison (<, =, or >, defined on numbers, including infinities; raises exception on NaN) or four-way comparison (<, =, >, or unordered, defined on all floating-point values including NaN).

Not only are these five arithmetic operations precisely defined on all inputs, but for numeric inputs, they are precisely defined to be correctly rounded: the floating-point addition operation 𝑥 ⊕ 𝑦 is defined to give fl(𝑥 + 𝑦), which is the result of rounding the real number sum 𝑥 + 𝑦 according to the current rounding mode,[3] which by default returns the nearest floating-point number, or, in the event of a tie, the nearest one whose least significant digit is even.
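A few concrete cases may make the default rounding rule easier to see. This is only an illustrative sketch in ECMAScript; the variable names are mine, and 2^53 is written as a literal so that nothing but addition is involved.

```javascript
// Each result below is the exact real sum rounded to the nearest binary64 value.
const sum = 0.1 + 0.2;

// 2^53 as a literal; from here up to 2^54, consecutive doubles are 2 apart.
const TWO_53 = 9007199254740992;

// 2^53 + 1 lies exactly halfway between 2^53 and 2^53 + 2.
// The tie goes to the neighbor whose last significand bit is even: 2^53.
const tieRoundsDown = TWO_53 + 1;

// 2^53 + 3 lies exactly halfway between 2^53 + 2 and 2^53 + 4.
// This time the even neighbor is the upper one: 2^53 + 4.
const tieRoundsUp = TWO_53 + 3;

console.log(sum === 0.30000000000000004); // true
console.log(tieRoundsDown === TWO_53);    // true
console.log(tieRoundsUp === TWO_53 + 4);  // true
```

Because the rounded result is fully determined by the exact sum, every conforming implementation has to print the same three values.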

ECMAScript 2015 (and 2021)

More concretely: Since ECMAScript 2015 specifies that a Number value is

primitive value corresponding to a double-precision 64-bit binary format IEEE 754-2008 value

can I conclude that the same operations yield the same result here, independent of the environment?

Yes.

The operations +, -, *, and / on numbers in ECMAScript 2015 are all precisely defined on all inputs in agreement with IEEE 754.[4] For example, the definition of addition in ECMAScript 2015 specifically states:

The result of an addition is determined using the rules of IEEE 754-2008 binary double-precision arithmetic:

The definition of addition in ECMAScript 2021 remains essentially the same, updated to cite IEEE 754-2019 instead:

The abstract operation Number::add takes arguments x (a Number) and y (a Number). It performs addition according to the rules of IEEE 754-2019 binary double-precision arithmetic, producing the sum of its arguments.

Similarly, equality in ECMAScript 2015 and equality in ECMAScript 2021 are defined in agreement with IEEE 754-2008 and IEEE 754-2019, although without an explicit citation. Relational operators in ECMAScript 2015 and relational operators in ECMAScript 2021 both implement the IEEE 754 notion of ordered comparison, returning false when either input is NaN and the appropriate ordering otherwise.
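The comparison rules can be checked directly. The following is a small sketch; the object and property names are just for illustration.

```javascript
// Ordered comparisons and equality involving NaN are always false;
// only "not equal" is true.
const comparisons = {
  nanLessThan: NaN < 1,        // false
  nanGreaterThan: NaN > 1,     // false
  nanEqualsItself: NaN === NaN, // false
  nanDiffersFromItself: NaN !== NaN, // true
  // IEEE 754 comparison treats the two zeros as equal.
  zerosAreEqual: 0 === -0,     // true
  // Infinities take part in the ordering like any other number.
  infinitiesAreOrdered: -Infinity < -Number.MAX_VALUE && Number.MAX_VALUE < Infinity // true
};

console.log(comparisons);
```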

Math.sqrt in ECMAScript 2015, and Math.sqrt in ECMAScript 2021, is allowed to return an implementation-defined approximation (subject to constraints about corner cases) to the square root, even though IEEE 754 precisely defines the square root operation and has done so since the beginning in IEEE 754-1985. Practically speaking, though, it is extremely unlikely that an implementation will fail to return the correctly rounded result as required by IEEE 754.

Note: Many operations other than the four or five basic arithmetic operations (+, -, *, /; Math.sqrt) are allowed to, and very likely will, vary from implementation to implementation. For example, one implementation might use a simple polynomial approximation for Math.log1p, while another might use a table-driven set of approximations, giving slightly different results for some inputs. This is sometimes exploited as a vector for browser fingerprinting. But any approximation you implement using only the basic arithmetic operations will agree in all ECMAScript implementations.
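To illustrate that last point, here is a rough sketch of a logarithm approximation built only from +, * and /. The function name log1pSeries and the choice of 20 terms are my own assumptions, and the series (log(1 + x) = 2·atanh(x / (2 + x))) is just one convenient option, not what any engine actually uses. Its accuracy is not the point; the point is that each step is a correctly rounded basic operation, so the returned bits should be the same in every ECMAScript implementation, whereas Math.log1p itself may differ in the last digits.

```javascript
// Hypothetical helper: approximates log(1 + x) for x > -1 using only
// addition, multiplication, and division.
function log1pSeries(x, terms = 20) {
  const t = x / (2 + x);   // log(1 + x) = 2 * atanh(t)
  const tSquared = t * t;
  let power = t;           // holds t^(2k+1) at step k
  let sum = 0;
  for (let k = 0; k < terms; k++) {
    sum = sum + power / (2 * k + 1); // atanh(t) = t + t^3/3 + t^5/5 + ...
    power = power * tSquared;
  }
  return 2 * sum;
}

const portable = log1pSeries(0.5);
const builtin = Math.log1p(0.5); // implementation-approximated
console.log(portable, builtin, Math.abs(portable - builtin));
```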

The operator % in ECMAScript 2015 and % in ECMAScript 2021 is defined precisely for all inputs, but does not agree with the IEEE 754 remainder operation: ECMAScript % uses truncating division, whereas IEEE 754 remainder uses round-to-nearest/ties-to-even division. (ECMAScript % is fmod in C, whereas IEEE 754 remainder is remainder in C.)
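The difference shows up as soon as the quotient rounds up rather than truncates. Below is a simplified sketch: ieeeRemainderSketch and roundHalfEven are hypothetical helpers of mine, and the naive formula x - y·n is only trustworthy for small values like these, where the quotient and product are harmless; it is not a faithful IEEE 754 remainder for arbitrary doubles.

```javascript
// Round a quotient to the nearest integer, ties to even.
function roundHalfEven(q) {
  const floor = Math.floor(q);
  const diff = q - floor;
  if (diff < 0.5) return floor;
  if (diff > 0.5) return floor + 1;
  return floor % 2 === 0 ? floor : floor + 1;
}

// Illustration only: IEEE-style remainder for small, exactly representable inputs.
function ieeeRemainderSketch(x, y) {
  return x - y * roundHalfEven(x / y);
}

console.log(5 % 3, ieeeRemainderSketch(5, 3));   // 2 -1
console.log(-5 % 3, ieeeRemainderSketch(-5, 3)); // -2 1
console.log(6 % 4, ieeeRemainderSketch(6, 4));   // 2 -2
```

With %, the result takes the sign of the dividend; with the IEEE 754 remainder, the result is the one closest to zero, and the tie in 6 / 4 = 1.5 goes to the even quotient 2.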

Other Languages

The answers above do not always apply to other languages. For example, the overwhelming majority of C implementations provide IEEE 754 binary64 arithmetic for double and binary32 arithmetic for float, but the C standard permits them to use different arithmetic rules within expressions, provided that they specify what the rules are through the FLT_EVAL_METHOD macro:

Except for assignment and cast (which remove all extra range and precision), the values yielded by operators with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type. The use of evaluation formats is characterized by the implementation-defined value of FLT_EVAL_METHOD:

  • -1 indeterminable;
  • 0 evaluate all operations and constants just to the range and precision of the type;
  • 1 evaluate operations and constants of type float and double to the range and precision of the double type, evaluate long double operations and constants to the range and precision of the long double type;
  • 2 evaluate all operations and constants to the range and precision of the long double type.

All other negative values for FLT_EVAL_METHOD characterize implementation-defined behavior.

(C11, §5.2.4.2.2: Characteristics of floating types <float.h>, ¶9, p. 30)

What this means is that when an implementation defines FLT_EVAL_METHOD to be 2, a function like

double
naive_fma(double x, double y, double z)
{
    return x*y + z;
}

will be implemented as if it had been written:

double
naive_fma(double x, double y, double z)
{
    /* x*y and the sum are both evaluated in long double precision. */
    return (long double)x*y + z;
}

Implementations of C on the Intel IA-32 architecture (“i386”) often work this way: they use the Intel x87 floating-point unit to evaluate expressions in 80-bit binary floating-point arithmetic with 64 bits of precision (“double-extended precision”), and then round to IEEE 754 binary64 wherever the results are stored in a double variable, passed as a double argument, or explicitly cast to double.[5]

However, this approach to evaluating expressions is not allowed in ECMAScript, so you don't have to worry about it. An implementation of C that works by compiling to ECMAScript the obvious way would simply define FLT_EVAL_METHOD to be 0.
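For comparison, here is the same function ported to ECMAScript as a sketch (naiveFma is just my camel-case rename of the C example). The inputs are chosen so that the exact product 1 + 2^-29 + 2^-60 does not fit in 53 bits: ECMAScript must round it to 1 + 2^-29 before the addition, so the result is 0, whereas an x87-style evaluation with a 64-bit significand would keep the 2^-60 term. The powers of two are formed by division so that only basic operations are used.

```javascript
// In ECMAScript the product is rounded to binary64 before the addition.
function naiveFma(x, y, z) {
  return x * y + z;
}

const x = 1 + 1 / 1073741824;    // 1 + 2^-30, exactly representable
const z = -(1 + 1 / 536870912);  // -(1 + 2^-29), exactly representable

// Exact x*x = 1 + 2^-29 + 2^-60; the 2^-60 term is below half an ulp
// (2^-53) and is rounded away, so the sum with z is exactly zero.
const result = naiveFma(x, x, z);
console.log(result === 0); // true
```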


[1] The content of NaN payloads may vary from implementation to implementation. However, whether the result is a NaN, and whether a NaN result is signaling or quiet, is defined by the standard.

[2] Some hardware also provides nonstandard modes of operation like flush-to-zero, which causes operations to return zero when under IEEE 754 semantics they would return subnormal numbers; in that case the hardware is not an implementation of the standard. If you enable these modes then you may get different answers, but normally they are not enabled, and they violate theorems often assumed by numerical algorithms such as the Sterbenz lemma, so they are only used in specialized applications. ECMAScript does not support flush-to-zero or other nonstandard modes of operation, nor do any implementations of which I am aware: you can rely on gradual underflow to subnormals as defined in IEEE 754.

[3] IEEE 754 allows the implementation to maintain a dynamic rounding mode, with four rounding directions defined: to-nearest/ties-to-even, up (toward positive infinity), down (toward negative infinity), and toward zero. In some environments, programs can query and change the current rounding mode, such as in C with fegetround and fesetround, though toolchain support for this is often limited and it serves mainly to inject small perturbations into numerical algorithms to check for drastic changes in the output indicating problems in the algorithm. ECMAScript does not support changing the rounding mode, nor do any implementations of which I am aware: you only have to deal with the default round-to-nearest/ties-to-even.

[4] The semantics of ECMAScript distinguishes only a single NaN value; there is no concept in ECMAScript of NaN payloads or of signaling vs. quiet NaN. Under the hood, two NaNs may be stored with different bit patterns, but ECMAScript does not distinguish them semantically, and provides no way to discriminate between them or examine the bit patterns under the hood.

[5] Evaluating expressions in higher precision can sometimes lead to errors from double rounding. For example, add 0x1p+53 and 0x1.7ffp+1: the first rounding to 64-bit precision will give 0x1.000000000000018p+53, so the second rounding to 53-bit precision gives 0x1.00000000000002p+53, whereas the correctly rounded sum with 53-bit precision is 0x1.00000000000001p+53. So why do it? In practice, it almost always leads to better accuracy in numerical algorithms, by using higher intermediate precision: you can afford to lose thousands of ulps with 64-bit precision and still get an answer that's within a couple ulps for 53-bit precision.
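The double-rounding example in the last footnote can be replayed with exact integer arithmetic. This is only a sketch under my own assumptions: roundTiesToEven is a hypothetical helper, and BigInt values scaled by 2^11 stand in for the exact sum 2^53 + 3 - 2^-11, so that an ulp of 2^-10 (64-bit significand) becomes 2 and an ulp of 2 (53-bit significand) becomes 4096.

```javascript
// Round a non-negative BigInt to a multiple of `unit`, ties to even.
function roundTiesToEven(n, unit) {
  const q = n / unit;
  const twiceRemainder = 2n * (n % unit);
  const roundUp = twiceRemainder > unit || (twiceRemainder === unit && q % 2n === 1n);
  return (roundUp ? q + 1n : q) * unit;
}

const SCALE = 2048n;                              // 2^11
const exactSum = (2n ** 53n + 3n) * SCALE - 1n;   // 2^53 + 3 - 2^-11, scaled
const ULP_64_BIT = 2n;                            // 2^-10, scaled
const ULP_53_BIT = 4096n;                         // 2, scaled

// One rounding straight to 53 bits, as ECMAScript does.
const roundedOnce = roundTiesToEven(exactSum, ULP_53_BIT) / SCALE;

// Rounding to 64 bits first, then to 53 bits, as x87-style evaluation does.
const roundedTwice =
  roundTiesToEven(roundTiesToEven(exactSum, ULP_64_BIT), ULP_53_BIT) / SCALE;

// The native ECMAScript sum of the same two operands.
const nativeSum = 9007199254740992 + (3 - 1 / 2048);

console.log(roundedOnce);  // 9007199254740994n, i.e. 2^53 + 2
console.log(roundedTwice); // 9007199254740996n, i.e. 2^53 + 4
console.log(nativeSum === Number(roundedOnce)); // true
```

The native sum matches the single rounding, which is what the ECMAScript definition of addition requires.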



Source: https://stackoverflow.com/questions/42181795/is-ieee-754-2008-deterministic
