How do you find a float's nearest non-equal value? [duplicate]

有些话、适合烂在心里 提交于 2019-11-30 11:21:46

The standard way to find a floating-point value's neighbors is the function nextafter for double and nextafterf for float. The second argument gives the direction. Remember that infinities are legal values in IEEE 754 floating-point, so you can very well call nextafter(x, +1.0/0.0) to get the value immediately above x, and this will work even for DBL_MAX (whereas if you wrote nextafter(x, DBL_MAX), it would return DBL_MAX when applied for x == DBL_MAX).

Two non-standard ways that are sometimes useful are:

  1. access the representation of the float/double as an unsigned integer of the same size, and increment or decrement this integer. The floating-point format was carefully designed so that for positive floats, and respectively for negative floats, the bits of the representation, seen as an integer, evolve monotonously with the represented float.

  2. change the rounding mode to upward, and add the smallest positive floating-point number. The smallest positive floating-point number is also the smallest increment that there can be between two floats, so this will never skip any float. The smallest positive floating-point number is FLT_MIN * FLT_EPSILON.


For the sake of completeness, I will add that even without changing the rounding mode from its “to nearest” default, multiplying a float by (1.0f + FLT_EPSILON) produces a number that is either the immediate neighbor away from zero, or the neighbor after that. It is probably the cheapest if you already know the sign of the float you wish to increase/decrease and you don't mind that it sometimes does not produce the immediate neighbor. Functions nextafter and nextafterf are specified in such a way that a correct implementation on the x86 must test for a number of special values and FPU states, and is thus rather costly for what it does.

To go towards zero, multiply by 1.0f - FLT_EPSILON.

This doesn't work for 0.0f, obviously, and generally for the smaller denormalized numbers.

The values for which multiplying by 1.0f + FLT_EPSILON advance by 2 ULPS are just below a power of two, specifically in the interval [0.75 * 2p … 2p). If you don't mind doing a multiplication and an addition, x + (x * (FLT_EPSILON * 0.74)) should work for all normal numbers (but still not for zero nor for all the small denormal numbers).

Look at the "nextafter" function, which is part of Standard C (and probably C++, but I didn't check).

I tried it out on my machine. And all three approaches:
1. adding with 1 and memcopying
2. adding FLT_EPSILON
3. multiplying by (1.0f + FLT_EPSILON)
seems to give the same answer.


see the result here
bash-3.2$ cc float_test.c -o float_test; ./float_test 1.023456 10
Original num: 1.023456
int added = 1.023456 01-eps added = 1.023456 mult by 01*(eps+1) = 1.023456
int added = 1.023456 02-eps added = 1.023456 mult by 02*(eps+1) = 1.023456
int added = 1.023456 03-eps added = 1.023456 mult by 03*(eps+1) = 1.023456
int added = 1.023456 04-eps added = 1.023456 mult by 04*(eps+1) = 1.023456
int added = 1.023457 05-eps added = 1.023457 mult by 05*(eps+1) = 1.023457
int added = 1.023457 06-eps added = 1.023457 mult by 06*(eps+1) = 1.023457
int added = 1.023457 07-eps added = 1.023457 mult by 07*(eps+1) = 1.023457
int added = 1.023457 08-eps added = 1.023457 mult by 08*(eps+1) = 1.023457
int added = 1.023457 09-eps added = 1.023457 mult by 09*(eps+1) = 1.023457
int added = 1.023457 10-eps added = 1.023457 mult by 10*(eps+1) = 1.023457

Code
#include <float.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>

int main(int argc, char *argv[])
{

    if(argc != 3) {
        printf("Usage: <binary> <floating_pt_num> <num_iter>\n");
        exit(0);
    }

    float f = atof(argv[1]);
    int count = atoi(argv[2]);

    assert(count > 0);

    int i;
    int num;
    float num_float;

    printf("Original num: %f\n", f);
    for(i=1; i<=count; i++) {
        memcpy(&num, &f, 4);
        num += i;
        memcpy(&num_float, &num, 4);
        printf("int added = %f \t%02d-eps added = %f \tmult by %2d*(eps+1) = %f\n", num_float, i, f + i*FLT_EPSILON, i, f*(1.0f + i*FLT_EPSILON));
    }

    return 0;
}
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!