Fastest Implementation of Exponential Function Using AVX

忘掉有多难 2020-11-29 08:42

I'm looking for an efficient (fast) approximation of the exponential function operating on AVX elements (single-precision floating point). Namely - __m256 _mm256_exp_

4 answers
  •  野趣味 (OP)
    2020-11-29 09:25

    You can approximate the exponential yourself with a Taylor series:

    exp(z) = 1 + z + pow(z,2)/2 + pow(z,3)/6 + pow(z,4)/24 + ...
    

    For that you only need the AVX addition and multiplication operations. Hard-code the coefficients 1/2, 1/6, 1/24, etc. as constants and multiply by them; that is faster than dividing.

    Take as many terms of the series as your precision requires. Note that the error to look at is the relative error: for small z the absolute error may be around 1e-6, while for large z the absolute error will be larger than 1e-6. With enough terms for the range of z, the relative error abs(E - E1)/abs(E) can still be kept below 1e-6 (where E is the exact exponential and E1 is the approximation).
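As a rough illustration of the series approach, here is a minimal sketch that evaluates the first eight terms (up to z^7/5040) on 8 floats at once, using Horner's scheme so that each step is one multiply and one add. The function name `exp_taylor8_ps` and the choice of eight terms are assumptions made for this example, not part of the original answer. The `target("avx")` attribute is GCC/Clang-specific; compiling with `-mavx` works as well.

```c
#include <immintrin.h>

/* Hypothetical helper (name and term count are illustrative assumptions):
 * evaluates 1 + z + z^2/2 + ... + z^7/5040 for 8 packed floats.
 * Only accurate for small |z|; the truncation error grows quickly with |z|. */
__attribute__((target("avx")))
void exp_taylor8_ps(const float *in, float *out)
{
    /* 1/k! for k = 0..7, hard-coded so no division happens at run time */
    static const float c[8] = {
        1.0f, 1.0f, 1.0f / 2, 1.0f / 6,
        1.0f / 24, 1.0f / 120, 1.0f / 720, 1.0f / 5040
    };

    __m256 z = _mm256_loadu_ps(in);      /* unaligned load of 8 floats */
    __m256 p = _mm256_set1_ps(c[7]);     /* start with the highest coefficient */

    /* Horner's scheme: p = p * z + c[k], from the highest term down to c[0] */
    for (int k = 6; k >= 0; --k)
        p = _mm256_add_ps(_mm256_mul_ps(p, z), _mm256_set1_ps(c[k]));

    _mm256_storeu_ps(out, p);
}
```

With eight terms the first neglected term is z^8/40320, so the result is only good for |z| of roughly 1 or less; larger inputs need more terms or a reduced input range.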

    UPDATE: As @Peter Cordes has mentioned in a comment, precision can be improved by separating exponentiation of integer and fractional parts, handling the integer part by manipulating the exponent field of the binary float representation (which is based on 2^x, not e^x). Then your Taylor series only has to minimize error over a small range.
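The split into integer and fractional parts could look roughly like the sketch below. It writes exp(x) = 2^n * exp(r) with n = round(x * log2(e)) and r = x - n * ln(2), so that |r| stays below about 0.35, evaluates the same eight-term series on r, and builds 2^n by writing n + 127 into the exponent field of the float. The name `exp_reduced8_ps`, the clamp limits, and the two-constant split of ln(2) (the values used in Cephes-style `expf` implementations) are assumptions for this example. It needs AVX2 for the integer shift and add; the `target("avx2")` attribute is GCC/Clang-specific, and `-mavx2` works as well.

```c
#include <immintrin.h>

/* Hypothetical helper (name, clamp range and term count are assumptions):
 * exp(x) for 8 packed floats via exp(x) = 2^n * exp(r). */
__attribute__((target("avx2")))
void exp_reduced8_ps(const float *in, float *out)
{
    static const float c[8] = {          /* 1/k! for k = 0..7 */
        1.0f, 1.0f, 1.0f / 2, 1.0f / 6,
        1.0f / 24, 1.0f / 120, 1.0f / 720, 1.0f / 5040
    };
    const __m256 log2e  = _mm256_set1_ps(1.44269504f);      /* log2(e) */
    const __m256 ln2_hi = _mm256_set1_ps(0.693359375f);     /* ln(2), high part */
    const __m256 ln2_lo = _mm256_set1_ps(-2.12194440e-4f);  /* ln(2), low part */

    __m256 x = _mm256_loadu_ps(in);
    /* Clamp so that n + 127 stays inside the valid exponent range. */
    x = _mm256_min_ps(_mm256_max_ps(x, _mm256_set1_ps(-87.0f)),
                      _mm256_set1_ps(88.0f));

    /* n = round(x * log2(e)), kept as a float holding an integer value */
    __m256 n = _mm256_round_ps(_mm256_mul_ps(x, log2e),
                               _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);

    /* r = x - n * ln(2), subtracted in two steps to limit rounding error */
    __m256 r = _mm256_sub_ps(x, _mm256_mul_ps(n, ln2_hi));
    r = _mm256_sub_ps(r, _mm256_mul_ps(n, ln2_lo));

    /* Series for exp(r) on the small range, via Horner's scheme */
    __m256 p = _mm256_set1_ps(c[7]);
    for (int k = 6; k >= 0; --k)
        p = _mm256_add_ps(_mm256_mul_ps(p, r), _mm256_set1_ps(c[k]));

    /* 2^n: place the biased exponent (n + 127) in bits 23..30 of a float */
    __m256i e = _mm256_add_epi32(_mm256_cvtps_epi32(n), _mm256_set1_epi32(127));
    __m256 pow2n = _mm256_castsi256_ps(_mm256_slli_epi32(e, 23));

    _mm256_storeu_ps(out, _mm256_mul_ps(p, pow2n));
}
```

Because the clamp simply saturates the input, this sketch does not produce infinity, zero, or denormal results for out-of-range x; a production version would have to decide how to treat those cases and NaN.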
