epsilon for various float values

一世执手 提交于 2019-11-29 18:13:55
Eric Postpischil

Caution: Bugs were found in this code while working on another answer. I hope to update this later. In the meantime, it fails for some values involving subnormals.

C provides a function for this, in the <math.h> header. nextafterf(x, INFINITY) is the next representable value after x, in the direction toward INFINITY.

However, if you'd prefer to do it yourself:

The following returns the epsilon you seek, for single precision (float), assuming IEEE 754.

#include <float.h>
#include <math.h>


/*  Return the ULP of q.

    This was inspired by Algorithm 3.5 in Siegfried M. Rump, Takeshi Ogita, and
    Shin'ichi Oishi, "Accurate Floating-Point Summation", _Technical Report
    05.12_, Faculty for Information and Communication Sciences, Hamburg
    University of Technology, November 13, 2005.
*/
float ULP(float q)
{
    // SmallestPositive is the smallest positive floating-point number.
    static const float SmallestPositive = FLT_EPSILON * FLT_MIN;

    /*  Scale is .75 ULP, so multiplying it by any significand in [1, 2) yields
        something in [.75 ULP, 1.5 ULP) (even with rounding).
    */
    static const float Scale = 0.75 * FLT_EPSILON;

    q = fabs(q);

    return fmax(SmallestPositive, q - (q - q * Scale));
}

The following returns the next value representable in float after the value it is passed (treating -0 and +0 as the same).

#include <float.h>
#include <math.h>


/*  Return the next floating-point value after the finite value q.

    This was inspired by Algorithm 3.5 in Siegfried M. Rump, Takeshi Ogita, and
    Shin'ichi Oishi, "Accurate Floating-Point Summation", _Technical Report
    05.12_, Faculty for Information and Communication Sciences, Hamburg
    University of Technology, November 13, 2005.
*/
float NextAfter(float q)
{
    // SmallestPositive is the smallest positive floating-point number.
    static const float SmallestPositive = FLT_EPSILON * FLT_MIN;

    /*  Scale is .625 ULP, so multiplying it by any significand in [1, 2)
        yields something in [.625 ULP, 1.25 ULP].
    */
    static const float Scale = 0.625 * FLT_EPSILON;

    return q + fmax(SmallestPositive, fabs(q)*Scale);
}
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!