How does float guarantee 7 digit precision?

Submitted by 与世无争的帅哥 on 2021-02-20 04:23:25

Question


As far as I know, a single-precision floating-point number has 1 bit for the sign, 8 bits for the exponent, and 23 bits for the mantissa.

I can understand that 7-digit integers fit in the 23-bit mantissa without losing precision, but I can't understand how a number like 1234567000000000 fits without losing the digits "1, 2, 3, 4, 5, 6, 7". What is the math behind this?


Answer 1:


The IEEE-754 basic 32-bit binary floating-point format only guarantees that six significant decimal digits will survive a round-trip conversion, not seven. Specifically: If you convert a number that is exactly represented with six decimal digits multiplied by a power of ten to the binary format using correct round-to-nearest rounding, and there is no overflow or underflow, and then you convert back to the nearest number representable with six decimal digits times a power of ten, the result will be the original number.

Generally, when a decimal numeral is converted to binary floating-point, the result might not have the same digits when written in decimal. Your example, 1234567000000000, converts to 1234567008616448, but we might find some case where 123456000… converts to 123455900…, so one of the original digits is different. But the precision supplied by the binary format is such that the result of converting to the binary format is always so near the original value that the difference is never more than half the position value of the sixth digit. For example, converting 123456000… will always produce a result between 123455500… and 123456500… Since the result of the first conversion is always within such an interval, converting it back to six decimal digits, with rounding, always produces the original number.
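The round trip described above can be tried out with a short Python sketch. It assumes that `struct`'s "f" format rounds to the nearest binary32 value; the conversion goes through Python's binary64 float first, which should not matter for a six-digit check. The helper names `to_float32` and `survives_six_digits` are made up for illustration.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def survives_six_digits(x: float) -> bool:
    """True if x, stored as binary32 and rounded back to six significant
    decimal digits, shows the same digits as x itself."""
    return f"{to_float32(x):.5e}" == f"{x:.5e}"

# The number from the question: the stored value is close, not identical.
stored = to_float32(1234567000000000.0)
print(int(stored))  # 1234567008616448

# A six-digit decimal survives the round trip.
print(survives_six_digits(1.23456e15))

# Sampled sweep over six-digit decimals m * 10**e (not an exhaustive proof).
failures = [
    (m, e)
    for e in range(-20, 21)
    for m in range(100000, 1000000, 997)
    if not survives_six_digits(float(f"{m}e{e}"))
]
print(len(failures))
```

The sweep only samples the six-digit inputs, so it illustrates the guarantee rather than proving it.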

In order to make this guarantee, the precision of the format must be as fine as one part in 999,999. That is because then the numbers in the binary format are so finely spaced that there is at least one in the interval from 9999985… to 9999995…, so the result of converting from decimal to binary can produce a result close enough to the original that converting back produces the original value. With 24 bits in the significand (one implied, 23 explicit), the precision is at least one part in 2^23, which is 8,388,608. (The significand can go up to 16,777,215, but we do not have a choice about it: it must be normalized, which keeps it in the range from 8,388,608 to 16,777,215.)

One part in 8,388,608 is better than one in 999,999, so six digits can be guaranteed, but it is not better than one in 9,999,999, so seven digits are not guaranteed.
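To see that seven digits really can fail, here is a small Python sketch using 8.589973e9 as a counterexample. Near that magnitude adjacent binary32 values are 1024 apart, while adjacent seven-digit decimals are only 1000 apart. As before, `to_float32` is a made-up helper, and the sketch assumes `struct`'s "f" format rounds to nearest.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

seven_digits = 8.589973e9          # seven significant decimal digits
stored = to_float32(seven_digits)

print(int(stored))       # 8589973504
print(f"{stored:.6e}")   # 8.589974e+09 (the seventh digit changed from 3 to 4)
```

The two nearest binary32 values are 8589972480 and 8589973504; the second is closer to the input, and rounding it back to seven digits gives 8.589974e9 rather than 8.589973e9.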




Answer 2:


I have seen a mathematical argument like this:

-log10(2^(-24)) = 7.225

Here 24 counts the implicit 1 before the binary point. If that leading 1 is not counted:

-log10(2^(-23)) = 6.9
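These two figures are easy to reproduce. The Python sketch below also evaluates floor((p - 1) * log10(2)), which, as far as I know, is the formula behind C's `FLT_DIG` for a binary format with p significand bits; the variable names are just illustrative.

```python
import math

P = 24  # significand bits of binary32, counting the implicit leading 1

digits_with_implicit_bit = -math.log10(2.0 ** -P)        # about 7.22
digits_explicit_only = -math.log10(2.0 ** -(P - 1))      # about 6.92

# Decimal digits guaranteed to survive decimal -> binary32 -> decimal.
guaranteed_digits = math.floor((P - 1) * math.log10(2))  # 6

print(digits_with_implicit_bit, digits_explicit_only, guaranteed_digits)
```

Neither logarithm is a guarantee by itself; they only say the format carries roughly 7 decimal digits of information, while the floor value is what can be promised for every input.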



Answer 3:


[I] can't understand how a number like 1234567000000000 fits without losing the digits "1, 2, 3, 4, 5, 6, 7". What is the math behind this?

I don't quite understand your reasoning here. Anyway, here is how 1234567000000000 is converted to an IEEE-754 binary32 (a.k.a. single precision, a.k.a. float in C):

1.09651577472686767578125 * 2**50

The exponent is stored with a bias of 127 (so the exponent 50 is stored as 177), the mantissa and exponent are encoded in base 2, and the first bit of the mantissa is dropped because it is always 1 for normalized numbers.
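If you would rather inspect the encoding locally, here is a minimal Python sketch that pulls the three fields out of the 32-bit pattern. It assumes `struct`'s "f" format produces the IEEE-754 binary32 encoding (true on common platforms); the variable names are illustrative.

```python
import struct

# Reinterpret the binary32 encoding of the number as an unsigned 32-bit integer.
bits = struct.unpack("<I", struct.pack("<f", 1234567000000000.0))[0]

sign = bits >> 31                      # 1 bit
exponent_field = (bits >> 23) & 0xFF   # 8 bits, stored with a bias of 127
fraction_field = bits & 0x7FFFFF       # 23 explicit mantissa bits

# Restore the implicit leading 1 and remove the exponent bias.
significand = 1 + fraction_field / 2**23
value = significand * 2.0 ** (exponent_field - 127)

print(sign, exponent_field, fraction_field)  # 0 177 809633
print(int(value))                            # 1234567008616448
```

Subtracting 127 from the stored exponent field gives 50, which matches the 2**50 factor above.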

Check https://babbage.cs.qc.cuny.edu/IEEE-754/ for yourself

As you can see, the decimal value after conversion to a float is not 1234567000000000 but 1234567008616448, and the first 7 digits match.

The loss of precision happens in the less significant digits because of the limited number of bits used to encode the mantissa (and because of the conversion to base 2).



Source: https://stackoverflow.com/questions/50491808/how-does-float-guarantee-7-digit-precision
