How to distinguish between 1 and zero floating-point values?

Submitted by 与世无争的帅哥 on 2019-12-11 15:09:59

Question


I know it might be a noob question, but since it's not covered in the e-book I'm studying I'm going to ask it. The IEEE standard binary floating-point format represents numbers in a kind of scientific notation, and I know that an integer value of one is always assumed to be added to the fractional part held in the significand, without being stored in the binary itself. What confuses me is how to distinguish between the floating-point values 1 and zero, because I assume both have an all-zero significand. I guess the distinction must be made by the exponent part, but I don't know how!


Answer 1:


For the zeroes (there are two, a positive and a negative zero, that differ in the sign bit but must be considered equal), the significand and the exponent are all 0-bits, whereas for non-zero values at least one of them has a 1-bit (for a value of 1, the exponent is all 1-bits except for the most significant one, and the significand is all 0-bits).

The Wikipedia article on the IEEE 754 standard lists the exact bit patterns.
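To see those bit patterns directly, here is a minimal Python sketch. It assumes single precision (32 bits) and uses the standard `struct` module; the helper names `float_to_bits` and `show` are made up for this illustration.

```python
import struct

def float_to_bits(x: float) -> int:
    """Return the single-precision bit pattern of x as an unsigned 32-bit integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def show(x: float) -> str:
    """Format the pattern as 'sign exponent significand' (1 + 8 + 23 bits)."""
    b = f"{float_to_bits(x):032b}"
    return f"{b[0]} {b[1:9]} {b[9:]}"

for value in (0.0, -0.0, 1.0):
    print(f"{value!r:>5}: {show(value)}")

# The two zeroes differ only in the sign bit but compare equal.
print(0.0 == -0.0)
```

Only the line for 1.0 has 1-bits in the exponent field; both zeroes have none there or in the significand.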




Answer 2:


I wrote an answer mentioning (among other things) the implicit bit (which is what I assume you're wondering about) here https://stackoverflow.com/questions/327020/why-are-floating-point-values-so-prolific/4164252#4164252

I'll expand on it further here. I'll use the character sequences "<=>" and "=>" to mean "equivalent to" and "giving".

If you look at an IEEE-754 single-precision floating point (SPFP) number in 32-bit unsigned integer format, this is how to extract the individual parts:

  • Sign: AND with 0x80000000 (1 bit) and shift right 31 places
  • Exponent: AND with 0x7f800000 (8 bits) and shift right 23 places
  • Significand (mantissa): AND with 0x007fffff (23 bits). If the exponent field is neither 0x00 nor 0xff (an ordinary, normalized non-zero number) you OR in the "implicit" bit with 0x00800000 (=> 24 bits in significand).
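The three extraction steps above can be sketched in Python as follows. The function name `decompose` is hypothetical, and restricting the implicit bit to exponent fields 0x01 to 0xfe is an assumption of this sketch (zeroes, subnormals, infinities and NaNs carry no implicit 1).

```python
import struct

def decompose(x: float):
    """Split a single-precision value into (sign, exponent field, significand)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = (bits & 0x80000000) >> 31      # 1 bit
    exponent = (bits & 0x7f800000) >> 23  # 8 bits, still biased by 0x7f
    significand = bits & 0x007fffff       # 23 explicit bits
    if 0x00 < exponent < 0xff:
        # Assumption: only normal numbers get the implicit leading 1.
        significand |= 0x00800000
    return sign, exponent, significand

for value in (0.0, -0.0, 1.0, -1.0):
    sign, exponent, significand = decompose(value)
    print(f"{value!r:>5}: sign={sign} exponent=0x{exponent:02x} significand=0x{significand:06x}")
```

For 1.0 this yields exponent field 0x7f and a significand in which only the implicit bit is set, while both zeroes come out with exponent 0x00 and significand 0.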

There are two variants of zero: 0.0 and -0.0 (0x00000000 and 0x80000000). Exponent = 0 and significand = 0 define a zero. In the same manner there are also two variants of one: 1.0 and -1.0 (0x3f800000 and 0xbf800000). As you can see there is no confusing 0.0 and 1.0. I'll try to explain why.

Any normal non-zero number has an exponent field in the range 0x01 to 0xfe. The two remaining values are reserved: exponent 0x00 with a non-zero significand encodes the subnormal (denormalized) numbers, i.e. the underflow case, and exponent 0xff encodes the overflow and error cases (infinity when the significand is zero, NaN when it is non-zero). The exponent field corresponding to 1.0 is 0x7f, which stands for 0 (see next paragraph) and gives 2^0 = 1. The next exponent below is 0x7e, which stands for -1 and gives 2^-1 = 0.5, and so on. For the exponent 0x7f the significand represents the numbers in the range 1.0 <= x < 2.0; that is, the exponent defines the lower end of the range, and the significand covers values up to but not including the next higher power of 2.
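A quick way to check these exponent ranges is to build bit patterns by hand and convert them back to floats. This is only a sketch; `bits_to_float` is a helper name invented here, and the specific patterns were picked for illustration.

```python
import math
import struct

def bits_to_float(bits: int) -> float:
    """Interpret an unsigned 32-bit integer as a single-precision float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

lowest = bits_to_float(0x3f800000)   # exponent 0x7f, significand all zeros
highest = bits_to_float(0x3fffffff)  # exponent 0x7f, significand all ones
half = bits_to_float(0x3f000000)     # exponent 0x7e, significand all zeros

print(lowest, highest, half)

# Reserved exponent fields: 0x00 (subnormals) and 0xff (infinity / NaN).
tiny = bits_to_float(0x00000001)      # smallest subnormal
infinity = bits_to_float(0x7f800000)  # exponent 0xff, significand zero
not_a_number = bits_to_float(0x7fc00000)  # exponent 0xff, significand non-zero

print(tiny, infinity, math.isnan(not_a_number))
```

Every pattern with exponent field 0x7f lands in the range 1.0 <= x < 2.0, and dropping the field to 0x7e halves the lower bound.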

If you find the exponent difficult to understand and want it to appear "more normal" (being a base 10 person), you can subtract 0x7f (127) from it, which gives the range -126 to 127 for normal numbers. The reserved field 0xff then maps to 128 (the overflow exponent) and 0x00 to -127 (the underflow exponent).

Just so you don't think I've forgotten: if you have the sign bit set the exponent 0x7f will attempt to represent all numbers in the range -1.0 >= x > -2.0.

Now to the implicit bit. The implicit bit can be called bit "22.5" since it sits right in between the highest explicit bit of the significand and the lowest explicit bit of the exponent. It implies a 1 at the exponent's position. So for exponent 0x7f (<=> 0 => 2^0) it implies that 1.0 is a component of the real number being represented. The first explicit bit to the right of it (bit 22 of the mantissa) signals whether the number corresponding to the next smaller exponent (0x7f - 0x01 = 0x7e <=> -1 => 2^-1), or 0.5, is a component of the real number, and so on. The smallest component of a single precision floating point value with an exponent of 0x7f is therefore 0x7f - 23 (bits in significand) = 0x68 (<=> -23 => 2^-23).
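The weight of each explicit significand bit can be checked by setting one bit at a time on top of the pattern for 1.0. This is a minimal sketch; `bits_to_float`, `ONE` and `with_bit` are names made up for the illustration.

```python
import struct

def bits_to_float(bits: int) -> float:
    """Interpret an unsigned 32-bit integer as a single-precision float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

ONE = 0x3f800000  # exponent field 0x7f, all explicit significand bits clear

def with_bit(position: int) -> float:
    """Return 1.0 with a single explicit significand bit (0..22) set."""
    return bits_to_float(ONE | (1 << position))

for position in (22, 21, 0):
    weight = with_bit(position) - 1.0  # what that bit adds to the implicit 1.0
    print(f"bit {position:2d} adds {weight!r}")
```

Bit 22 adds 2^-1 and bit 0 adds 2^-23, which is the smallest step between neighbouring values when the exponent field is 0x7f.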

To put it all together, take the SPFP value 0x42b80000. Its exponent is 0x85 - 0x7f = 6, so the implicit bit contributes 2^6 = 64.0, and the real number is the sum of:

  • 2^6 * 1 (implicit bit always 1) +
  • 2^5 * 0 (bit 22 of significand is reset) +
  • 2^4 * 1 (bit 21 is set) +
  • 2^3 * 1 (bit 20 is set) +
  • 2^2 * 1 (bit 19 is set) +
  • (bits 18 to 0 are reset and their corresponding components (2^1 to 2^-17) are therefore not used)

2^6+2^4+2^3+2^2 => 64+16+8+4 => 92.0 which is the real number represented by 0x42b80000.
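The same sum can be reproduced programmatically. The sketch below walks the 24-bit significand of 0x42b80000 from the implicit bit downwards and adds one power of two per set bit; it assumes a normal number, and the variable names are illustrative only.

```python
import struct

bits = 0x42b80000

exponent = ((bits & 0x7f800000) >> 23) - 0x7f   # 0x85 - 0x7f = 6
significand = (bits & 0x007fffff) | 0x00800000  # 24 bits, implicit bit ORed in

value = 0.0
for position in range(23, -1, -1):
    if significand & (1 << position):
        # Bit 23 (the implicit bit) weighs 2^exponent; each lower bit weighs half as much.
        value += 2.0 ** (exponent - (23 - position))

# Cross-check against the conversion done by the struct module.
reference = struct.unpack(">f", struct.pack(">I", bits))[0]
print(value, reference)
```

Both numbers come out as 92.0, matching the manual sum 64 + 16 + 8 + 4.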

In this example you can see how/that the significand is left-adjusted which allows the implicit bit 22.5 of the SPFP format to become explicit bit 23 (though always set) of the significand, thereby adding an additional bit of precision to the SPFP format. The DPFP (Double Precision) format is similar but the exponent range is larger and the significand longer.

I recommend you do some experimenting on the format. My personal guess is that 99% of all programmers never have.



Source: https://stackoverflow.com/questions/4827176/how-to-distinguish-between-1-and-zero-floating-point-values
