Fast 1/X division (reciprocal)

忘掉有多难 2020-12-29 22:46

Is there some way to speed up computing the reciprocal (1 divided by X) if precision is not crucial?

So, I need to calculate 1/X. Is there so

6 answers
  •  青春惊慌失措
    2020-12-29 23:19

    This should do it, using a number of pre-unrolled Newton iterations evaluated as a Horner polynomial. The polynomial maps onto fused multiply-add (FMA) operations, which most modern CPUs can issue every clock cycle (each FMA still has a latency of several cycles):

    #include <stdint.h>

    float inv_fast(float x) {
        union { float f; uint32_t i; } v;
        float w, sx;

        // Work on |x| and restore the sign at the end.
        sx = (x < 0) ? -1.0f : 1.0f;
        x = sx * x;

        // Initial guess: subtract the bit pattern of x from a magic constant.
        // Going through the union avoids the aliasing violation of *(uint32_t *)&x.
        v.f = x;
        v.i = 0x7EF127EAu - v.i;
        w = x * v.f;

        // Efficient iterative approximation improvement in Horner polynomial form.
        v.f = v.f * (2.0f - w);     // Single iteration, Err = -3.36e-3 * 2^(-flr(log2(x)))
        // v.f = v.f * (4.0f + w * (-6.0f + w * (4.0f - w)));  // Second iteration, Err = -1.13e-5 * 2^(-flr(log2(x)))
        // v.f = v.f * (8.0f + w * (-28.0f + w * (56.0f + w * (-70.0f + w * (56.0f + w * (-28.0f + w * (8.0f - w)))))));  // Third iteration, Err = +-6.8e-8 * 2^(-flr(log2(x)))

        return v.f * sx;
    }
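
    To see where the coefficients 4, -6, 4, -1 in the "second iteration" line come from, here is a minimal sketch that compares two explicit Newton steps y <- y * (2 - x * y) with the unrolled Horner form. With w = x * y0, two steps expand to y0 * (4 - 6w + 4w^2 - w^3). The function names (inv_guess, newton_step, two_steps_explicit, two_steps_horner) are made up for this illustration and are not part of the answer's code; positive inputs are assumed.

```c
#include <stdint.h>

/* Initial guess from the magic constant used in the answer (x > 0 assumed). */
float inv_guess(float x) {
    union { float f; uint32_t i; } v;
    v.f = x;
    v.i = 0x7EF127EAu - v.i;
    return v.f;
}

/* One Newton step for f(y) = 1/y - x. */
float newton_step(float x, float y) {
    return y * (2.0f - x * y);
}

/* Two Newton steps applied one after the other. */
float two_steps_explicit(float x, float y0) {
    return newton_step(x, newton_step(x, y0));
}

/* The same two steps unrolled into one polynomial in w = x * y0,
   evaluated in Horner form: y0 * (4 - 6w + 4w^2 - w^3). */
float two_steps_horner(float x, float y0) {
    float w = x * y0;
    return y0 * (4.0f + w * (-6.0f + w * (4.0f - w)));
}
```

    Both variants should agree up to rounding. The Horner form needs only w and y0 as inputs, which is what lets it be written as a chain of multiply-adds.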
    

    Fine print: The absolute error grows as x approaches 0 (see the 2^(-flr(log2(x))) factor in the error estimates above), so you, the programmer, need to either test the accuracy over your expected input range or keep the input from getting too low, resorting to hardware division otherwise. I.e., be responsible!
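
    One way to act on the fine print is sketched below, under stated assumptions: max_rel_error sweeps a range of positive inputs and reports the worst relative error against hardware division, and inv_guarded falls back to hardware division below a caller-chosen magnitude. Both helpers, their names, and their parameters (lo, hi, ratio, min_abs) are hypothetical and not part of the original answer. inv_fast is repeated with a single Newton iteration so the block compiles on its own.

```c
#include <stdint.h>

/* Approximation from the answer, with one Newton iteration. */
float inv_fast(float x) {
    union { float f; uint32_t i; } v;
    float sx = (x < 0) ? -1.0f : 1.0f;
    float w;

    x = sx * x;
    v.f = x;
    v.i = 0x7EF127EAu - v.i;
    w = x * v.f;
    v.f = v.f * (2.0f - w);
    return v.f * sx;
}

/* Worst relative error of inv_fast over [lo, hi], with lo > 0.
   Samples are spaced geometrically: each x is `ratio` (> 1) times the last. */
float max_rel_error(float lo, float hi, float ratio) {
    float worst = 0.0f;
    for (float x = lo; x <= hi; x *= ratio) {
        float exact = 1.0f / x;               /* hardware division as reference */
        float err = (inv_fast(x) - exact) / exact;
        if (err < 0.0f) err = -err;
        if (err > worst) worst = err;
    }
    return worst;
}

/* Use hardware division when |x| is below min_abs (a threshold the caller
   picks after testing), and the fast approximation otherwise. */
float inv_guarded(float x, float min_abs) {
    float ax = (x < 0.0f) ? -x : x;
    return (ax < min_abs) ? 1.0f / x : inv_fast(x);
}
```

    For example, max_rel_error(0.01f, 100.0f, 1.01f) gives a quick accuracy check over four decades of input. Note that the extra branch in inv_guarded costs time of its own, so it is worth benchmarking against plain division before adopting it.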
