Fastest implementation of sine, cosine and square root in C++ (doesn't need to be very accurate)

执笔经年 2020-12-04 10:55

I have been googling this question for the past hour, but I only find pointers to Taylor series or sample code that is either too slow or does not compile at all. Well, most answe

15 answers
  •  死守一世寂寞
    2020-12-04 11:48

    Based on the idea from http://forum.devmaster.net/t/fast-and-accurate-sine-cosine/9648, plus some manual rewriting to improve performance in a micro-benchmark, I ended up with the following cosine implementation. It is used in an HPC physics simulation that is bottlenecked by repeated cos calls over a large range of values. It is accurate enough and much faster than a lookup table; most notably, no division is required.

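    #include <cmath>  // std::floor, std::abs; M_PI is a POSIX extension, not standard C++

    // Keep this in its own namespace (or rename it) so that calls do not
    // resolve to ::cos from <math.h>.
    template <typename T>
    inline T cos(T x) noexcept
    {
        constexpr T tp = 1./(2.*M_PI);        // scale radians to turns
        x *= tp;
        x -= T(.25) + std::floor(x + T(.25)); // wrap into [-0.5, 0.5) with a quarter-turn shift
        x *= T(16.) * (std::abs(x) - T(.5));  // parabola approximating the cosine
        #if EXTRA_PRECISION
        x += T(.225) * x * (std::abs(x) - T(1.)); // optional correction term
        #endif
        return x;
    }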
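
    The question also asks for sine. One way to get it is to shift the argument by a quarter period, since sin(x) = cos(x - π/2). This is only a sketch: it assumes the cosine above is renamed fast_cos (a hypothetical name, chosen so it cannot collide with the library cos), and fast_sin, check_fast_sin and the hard-coded constants are illustrative rather than part of the original code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the cosine above (basic variant, no extra precision).
template <typename T>
inline T fast_cos(T x) noexcept {
    constexpr T inv_two_pi = T(0.15915494309189535);  // 1 / (2*pi)
    x *= inv_two_pi;
    x -= T(.25) + std::floor(x + T(.25));
    x *= T(16.) * (std::abs(x) - T(.5));
    return x;
}

// Sine as a phase-shifted cosine: sin(x) = cos(x - pi/2).
template <typename T>
inline T fast_sin(T x) noexcept {
    constexpr T half_pi = T(1.5707963267948966);
    return fast_cos(x - half_pi);
}

// Sanity checks at two points where the parabola passes through the true value.
inline void check_fast_sin() {
    assert(std::abs(fast_sin(0.0)) < 1e-9);
    assert(std::abs(fast_sin(1.5707963267948966) - 1.0) < 1e-9);
}
```

    The shift costs one extra subtraction per call. It could probably be folded into the constant T(.25) inside the range reduction instead, but I have not benchmarked that variant.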
    

    The Intel compiler, at least, is smart enough to vectorize this function when it is used in a loop.
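
    For illustration, this is the kind of loop I mean. It is a minimal sketch: fast_cos and fast_cos_array are hypothetical names, and whether a given compiler actually vectorizes the loop depends on the compiler, the optimization flags and the target, so check the vectorization report rather than take it on trust.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the cosine above (basic variant, no extra precision).
template <typename T>
inline T fast_cos(T x) noexcept {
    constexpr T inv_two_pi = T(0.15915494309189535);  // 1 / (2*pi)
    x *= inv_two_pi;
    x -= T(.25) + std::floor(x + T(.25));
    x *= T(16.) * (std::abs(x) - T(.5));
    return x;
}

// Applies fast_cos element-wise. The loop body has no explicit branches and
// no dependency between iterations, which is what a vectorizer looks for.
inline void fast_cos_array(const std::vector<double>& in, std::vector<double>& out) {
    assert(in.size() == out.size());
    const double* src = in.data();
    double* dst = out.data();
    const std::size_t n = in.size();
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = fast_cos(src[i]);
    }
}
```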

    If EXTRA_PRECISION is defined as nonzero (for example, #define EXTRA_PRECISION 1), the maximum absolute error is about 0.00109 over the range -π to π, assuming T is double as it is defined in most C++ implementations (64-bit IEEE 754). Otherwise, the maximum absolute error is about 0.056 over the same range.
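
    If you want to check these error figures yourself, something like the following should do. It is a sketch under a few assumptions: fast_cos and max_abs_error are hypothetical names, the bool template parameter stands in for the EXTRA_PRECISION macro so both variants can be measured in one program, and the error is sampled at evenly spaced points rather than found analytically.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the cosine above; Extra replaces EXTRA_PRECISION.
template <bool Extra, typename T>
inline T fast_cos(T x) noexcept {
    constexpr T inv_two_pi = T(0.15915494309189535);  // 1 / (2*pi)
    x *= inv_two_pi;
    x -= T(.25) + std::floor(x + T(.25));
    x *= T(16.) * (std::abs(x) - T(.5));
    if (Extra) x += T(.225) * x * (std::abs(x) - T(1.));
    return x;
}

// Largest |fast_cos(x) - std::cos(x)| over `samples` evenly spaced
// points in [-pi, pi], both endpoints included.
template <bool Extra>
double max_abs_error(int samples) {
    assert(samples > 1);
    const double pi = 3.14159265358979323846;
    double worst = 0.0;
    for (int i = 0; i < samples; ++i) {
        const double x = -pi + 2.0 * pi * i / (samples - 1);
        worst = std::max(worst, std::abs(fast_cos<Extra>(x) - std::cos(x)));
    }
    return worst;
}
```

    Calling max_abs_error<true>(100001) and max_abs_error<false>(100001) and printing the results should give values close to the two figures quoted above.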
