double type digits in C++

折月煮酒 submitted on 2019-12-05 21:35:28

No. Counter-example: the two double-precision floating-point numbers closest to the rational number

1.11111111111118

(which has 15 decimal digits) are

1.1111111111111799942818834097124636173248291015625
1.1111111111111802163264883347437717020511627197265625

In other words, there is no double-precision floating-point number whose decimal expansion starts with 1.1111111111111800.

This question is a little malformed. The hardware stores the numbers in binary, not decimal, so in the general case you can't do precise math in base 10. Some decimal numbers (0.1 is one of them!) do not even have a finite, non-repeating representation in binary. If you have precision requirements like this, where you care about the number being of known precision to exactly 15 decimal digits, you will need to pick another representation for your numbers.
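A small sketch of what that means in practice, assuming an IEEE 754 `double`. The function names are invented for this example, and `volatile` is only there so the addition happens at run time rather than being folded by the compiler.

```cpp
#include <cstdio>

// Hypothetical check: would be true if 0.1, 0.2 and 0.3 were stored exactly.
bool tenth_plus_fifth_equals_three_tenths() {
    volatile double a = 0.1;
    volatile double b = 0.2;
    return a + b == 0.3;
}

// Prints the stored value of 0.1 with 20 decimals to expose the rounding.
void print_stored_tenth() {
    std::printf("%.20f\n", 0.1);
}
```

On a typical x86-64 build the comparison comes out false, and the printed value of 0.1 is slightly larger than one tenth.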

No, but I wonder if this is relevant to any of your issues (GCC specific):

GCC Documentation

-ffloat-store Do not store floating point variables in registers, and inhibit other options that might change whether a floating point value is taken from a register or memory.

This option prevents undesirable excess precision on machines such as the 68000 where the floating registers (of the 68881) keep more precision than a double is supposed to have. Similarly for the x86 architecture. For most programs, the excess precision does only good, but a few programs rely on the precise definition of IEEE floating point. Use -ffloat-store for such programs, after modifying them to store all pertinent intermediate computations into variables.

You should be able to directly modify the bits in your number by creating a union with a field for the floating point number and an integral type of the same size. Then you can access the bits you want and set them however you want. Here is an example where I whack the sign bit; you can choose any field you want, of course.

#include <stdio.h>

union double_int {
  double             fp;
  unsigned long long integer;
};

int main(void)
{
  /* use a real union object; casting a double* to a union pointer
     breaks the aliasing rules and is not guaranteed to work */
  union double_int  my_union;

  /* store the value through the floating-point member */
  my_union.fp = 1325.34634;

  /* print original numbers */
  printf("Float   %f\n", my_union.fp);
  printf("Integer %llx\n", my_union.integer);

  /* whack the sign bit to 1 */
  my_union.integer |= 1ULL << 63;

  /* print modified numbers */
  printf("Negative float   %f\n", my_union.fp);
  printf("Negative integer %llx\n", my_union.integer);

  return 0;
}
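Since the question is about C++, where reading a union member other than the one last written is formally undefined behaviour, here is a sketch of the same sign-bit trick using `std::memcpy` instead. The helper names are hypothetical, and the code assumes a 64-bit IEEE 754 `double` whose sign bit is bit 63.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Copy the object representation of a double into an integer.
std::uint64_t double_to_bits(double value) {
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Copy an integer bit pattern back into a double.
double bits_to_double(std::uint64_t bits) {
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Set the sign bit (bit 63 under the stated assumption).
double with_sign_bit_set(double value) {
    return bits_to_double(double_to_bits(value) | (std::uint64_t{1} << 63));
}
```

With these helpers, `with_sign_bit_set(1325.34634)` should give `-1325.34634`. Compilers generally turn the `memcpy` calls into plain register moves, so this usually costs nothing compared with the union version.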

Generally speaking, people only care about something like this ("I only want the first x digits") when displaying the number. That's relatively easy with stringstreams or sprintf.
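For the display case, a minimal sketch of both approaches. `format_significant` and `format_fixed` are names invented for this example, and the 64-byte buffer is an arbitrary choice that `snprintf` will truncate to rather than overflow.

```cpp
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// Stream version: keep at most `digits` significant digits.
std::string format_significant(double value, int digits) {
    std::ostringstream out;
    out << std::setprecision(digits) << value;
    return out.str();
}

// printf-style version: keep exactly `decimals` digits after the point.
std::string format_fixed(double value, int decimals) {
    char buffer[64];  // arbitrary size; longer output is truncated
    std::snprintf(buffer, sizeof buffer, "%.*f", decimals, value);
    return buffer;
}
```

For instance, `format_significant(1.11111111111118, 15)` should give back `1.11111111111118`, even though the stored double is not exactly that value. Only the text is rounded; the number in memory is unchanged.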

If you're concerned about comparing numbers with ==, you really can't do that reliably with floating-point numbers. Instead, check whether the numbers are close enough (say, within a small multiple of std::numeric_limits<double>::epsilon() of each other, scaled to the magnitude of the values).
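One possible shape for such a comparison, as a sketch only. The name `nearly_equal` and the default tolerance of 8 epsilons are assumptions made for this example; the right tolerance depends on how much rounding error your computation accumulates.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical helper: true if a and b differ by at most rel_tol relative to
// the larger magnitude, with a floor of 1.0 so values near zero still compare.
bool nearly_equal(double a, double b,
                  double rel_tol = 8 * std::numeric_limits<double>::epsilon()) {
    double scale = std::max({std::fabs(a), std::fabs(b), 1.0});
    return std::fabs(a - b) <= rel_tol * scale;
}
```

With this helper, `nearly_equal(0.1 + 0.2, 0.3)` is true even though `0.1 + 0.2 == 0.3` is false for IEEE 754 doubles.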

Playing with the bits of the number directly isn't a great idea.
