Error due to limited precision of float and double

Submitted by 做~自己de王妃 on 2020-01-13 17:46:27

Question


In C++, I use the following code to work out the order of magnitude of the error due to the limited precision of float and double:

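 #include <iostream>
 using namespace std;

 int main() {
     float n = 1;
     float dec = 1;

     // Shrink dec until subtracting it from n no longer changes n.
     while (n != (n - dec)) {
         dec = dec / 10;
     }
     cout << dec << endl;
 }

(For the double case, all I do is replace float with double in the declarations of n and dec.)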

Now when I compile and run this using g++ on a Unix system, the results are

Float  10^-8
Double 10^-17

However, when I compile and run it using MinGW on Windows 7, the results are

Float  10^-20
Double 10^-20

What is the reason for this?


Answer 1:


I'll turn my comment into an answer and expand on it. This is only a hypothesis; I may be wrong.

MinGW on Windows is probably trying to preserve precision by promoting the intermediates of expressions to the full 80-bit precision of x86.

Therefore, both sides of the expression n != (n-dec) are evaluated with 64 bits of precision (the 80-bit floating-point format has a 64-bit mantissa), regardless of whether n and dec are declared as float or double.

2^-64 ~ 10^-20

So the numbers make sense.

Visual Studio also promotes intermediates by default, but only up to double precision.




Answer 2:


Why don't you check the size of float and double on both operating systems?




Answer 3:


This simply shows that the different environments use different sizes for float and double.

According to the C++ standard, double must provide at least as much precision as float. If you want to find out just how large the types are on your system, use sizeof.

What your tests seem to indicate is that g++ uses separate sizes for float and double (32 and 64 bits respectively) while MinGW32 on your Windows system uses the same size for both. Both versions are standard conforming and neither behaviour can be relied upon in general.



Source: https://stackoverflow.com/questions/7702177/error-due-to-limited-precision-of-float-and-double
