Does std::scientific always result in normalized scientific notation for floating-point numbers?

Submitted by 柔情痞子 on 2021-01-27 07:06:39

Question


Scientific notation defines how a number is displayed using a sign, a significand and an exponent, but it does not require the representation to be normalized.

An example: -2.34e-2 (normalized scientific notation) is the same as -0.234e-1 (scientific notation)

Can I rely on the following code always producing the normalized form? Edit: except for NaN and infinity, as pointed out in the answers.

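#include <sstream>
#include <string>

// Formats number in scientific notation with the requested number of
// significant digits (one before the decimal point, the rest after it).
template<typename T>
static std::string toScientificNotation(T number, unsigned significantDigits)
{
    // precision() counts only the digits after the decimal point
    if (significantDigits > 0) {
        significantDigits--;
    }
    std::stringstream ss;
    ss.precision(significantDigits);
    ss << std::scientific << number;
    return ss.str();
}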

If yes, please cite a section of the C++ documentation/standard stating that this is not platform- or implementation-defined. Since the value 0 is also represented differently, I'm afraid that certain very small numbers (denormalized?!) could be displayed differently. On my platform with my compiler it currently works for std::numeric_limits<T>::min() and denorm_min().

Note: I use this to find the order of magnitude of a number without messing with all the quirky details of floating-point number analysis. I wanted the standard library to do it for me :-)
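
As a rough sketch of the order-of-magnitude use case described in the note above, the exponent can be read back out of the formatted string. The helper name orderOfMagnitude, the bool-plus-out-parameter interface, and the choice of 17 digits of precision are assumptions made for illustration, not part of the original code. The printed exponent reflects the value after rounding to the chosen precision, so with a small precision a value just below a power of ten can round up into the next decade.

```cpp
#include <cassert>
#include <cmath>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original post): writes the decimal
// exponent printed by std::scientific into `exponent` and returns true.
// Returns false for NaN and infinity, which are printed without an exponent.
bool orderOfMagnitude(double number, int& exponent)
{
    if (!std::isfinite(number)) {
        return false;
    }
    std::ostringstream ss;
    ss.imbue(std::locale::classic());  // avoid locale-specific formatting
    ss.precision(17);                  // assumption: many digits, so rounding rarely changes the exponent
    ss << std::scientific << number;
    const std::string text = ss.str();

    // For finite values the %e style always contains an exponent marker.
    const std::string::size_type pos = text.find('e');
    assert(pos != std::string::npos);

    // std::stoi accepts the explicit sign and leading zeros, e.g. "-02".
    exponent = std::stoi(text.substr(pos + 1));
    return true;
}
```

Calling orderOfMagnitude(-0.0234, e) should set e to -2, and a zero argument should yield 0, since the C standard specifies a zero exponent for a zero value.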


Answer 1:


Yes, except for zero, infinity and NaN.

The C++ standard refers to the C standard for formatting, which requires normalized scientific notation.

  • [floatfield.manip]/2

    ios_base& scientific(ios_base& str);
    

    Effects: Calls str.setf(ios_base::scientific, ios_base::floatfield).

    Returns: str.

  • [ostream.inserters.arithmetic]/1 (partial)

    operator<<(float val);
    operator<<(double val);
    operator<<(long double val);
    

    Effects: The classes num_get<> and num_put<> handle locale-dependent numeric formatting and parsing. These inserter functions use the imbued locale value to perform numeric formatting. When val is of type ..., double, long double, ..., the formatting conversion occurs as if it performed the following code fragment:

    bool failed = use_facet<
      num_put<charT, ostreambuf_iterator<charT, traits>>
        >(getloc()).put(*this, *this, fill(), val).failed();
    

    When val is of type float the formatting conversion occurs as if it performed the following code fragment:

    bool failed = use_facet<
      num_put<charT, ostreambuf_iterator<charT, traits>>
        >(getloc()).put(*this, *this, fill(),
          static_cast<double>(val)).failed();
    
  • [facet.num.put.virtuals]/1:5.1 (partial)

    • Stage 1:

      The first action of stage 1 is to determine a conversion specifier. The tables that describe this determination use the following local variables

      fmtflags flags = str.flags();
      fmtflags floatfield = (flags & (ios_base::floatfield));
      

      For conversion from a floating-point type, the function determines the floating-point conversion specifier as indicated in Table 70.

      Table 70 — Floating-point conversions

      | State                                            | stdio equivalent |
      | ------------------------------------------------ | ---------------- |
      | floatfield == ios_base::scientific && !uppercase | %e               |
      | floatfield == ios_base::scientific               | %E               |
      

      The representation at the end of stage 1 consists of the char's that would be printed by a call of printf(s, val) where s is the conversion specifier determined above.

  • C11 n1570 [7.21.6.1]:8.4

    • e,E

      A double argument representing a floating-point number is converted in the style [−]d.ddde±dd, where there is one digit (which is nonzero if the argument is nonzero) before the decimal-point character and the number of digits after it is equal to the precision; if the precision is missing, it is taken as 6; if the precision is zero and the # flag is not specified, no decimal-point character appears. The value is rounded to the appropriate number of digits. The E conversion specifier produces a number with E instead of e introducing the exponent. The exponent always contains at least two digits, and only as many more digits as necessary to represent the exponent. If the value is zero, the exponent is zero.

      A double argument representing an infinity or NaN is converted in the style of an f or F conversion specifier.
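
The chain of references above (stream insertion, then num_put, then the printf %e conversion) can be checked empirically with a small sketch. The helper names viaStream and viaPrintf are hypothetical, and the comparison assumes the classic "C" locale on both paths. It is a spot check on one implementation, not a proof of conformance.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: formats through the iostream path (std::scientific).
std::string viaStream(double value, int precision)
{
    std::ostringstream ss;
    ss.imbue(std::locale::classic());
    ss.precision(precision);
    ss << std::scientific << value;
    return ss.str();
}

// Hypothetical helper: formats through printf's %e conversion directly.
std::string viaPrintf(double value, int precision)
{
    // First call measures the required length, second call writes the text.
    const int length = std::snprintf(nullptr, 0, "%.*e", precision, value);
    assert(length >= 0);
    std::string text(static_cast<std::size_t>(length) + 1, '\0');
    std::snprintf(&text[0], text.size(), "%.*e", precision, value);
    text.resize(static_cast<std::size_t>(length));
    return text;
}
```

On a conforming implementation, viaStream(-0.0234, 2) should give "-2.34e-02" and viaStream(0.0, 2) should give "0.00e+00", each matching viaPrintf with the same arguments. Infinity and NaN come out in the f/F style, that is, without an exponent part.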




Answer 2:


Can I rely on the following code always producing the normalized outcome?

There is no guarantee of it, no. Better said: the Standard does not impose a guarantee as strong as you wish it did.

std::scientific is only mentioned in the following relevant parts:

  1. [floatfield.manip]:2

    ios_base& scientific(ios_base& str);  
    

    Effects: Calls str.setf(ios_base::scientific, ios_base::floatfield).
    Returns: str.

  2. Table 101 — fmtflags effects

    | Element    | Effect(s) if set                                       |
    | ...        | ...                                                    |
    | scientific | generates floating-point output in scientific notation |
    | ...        | ...                                                    |
    


Source: https://stackoverflow.com/questions/50583676/does-stdscientific-always-result-in-normalized-scientific-notation-for-floatin
