I'm looking for an efficient (fast) approximation of the exponential function operating on AVX elements (single-precision floating point). Namely - __m256 _mm256_exp_
You can approximate the exponential yourself with a Taylor series:
exp(z) = 1 + z + pow(z,2)/2 + pow(z,3)/6 + pow(z,4)/24 + ...
For that you need only the addition and multiplication operations from AVX. Hard-code coefficients such as 1/2, 1/6, 1/24 as constants and multiply by them; that is faster than dividing.
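As a rough sketch of the above (not a tuned implementation), here are the first 8 terms of the series evaluated on 8 floats at once with only AVX multiplies and adds. The names exp256_taylor8 and exp8_taylor, the choice of 8 terms, and the use of Horner's rule are my own assumptions for illustration. The AVX_FN macro is a GCC/Clang convenience so the file builds without -mavx; drop it if you already compile with that flag.

```c
#include <immintrin.h>

/* Assumption: GCC/Clang. Lets these functions use AVX without -mavx. */
#if defined(__GNUC__) || defined(__clang__)
#define AVX_FN __attribute__((target("avx")))
#else
#define AVX_FN
#endif

/* exp(z) ~ 1 + z + z^2/2! + ... + z^7/7!, evaluated by Horner's rule.
   Only accurate for small |z| (roughly |z| <= 0.5 at this length). */
AVX_FN __m256 exp256_taylor8(__m256 z)
{
    /* Hard-coded reciprocal factorials: multiply, never divide. */
    static const float c[8] = {
        1.0f, 1.0f, 1.0f / 2, 1.0f / 6,
        1.0f / 24, 1.0f / 120, 1.0f / 720, 1.0f / 5040
    };
    __m256 p = _mm256_set1_ps(c[7]);
    for (int k = 6; k >= 0; --k)
        p = _mm256_add_ps(_mm256_mul_ps(p, z), _mm256_set1_ps(c[k]));
    return p;
}

/* Hypothetical helper: apply the approximation to 8 floats in memory. */
AVX_FN void exp8_taylor(const float *in, float *out)
{
    _mm256_storeu_ps(out, exp256_taylor8(_mm256_loadu_ps(in)));
}
```

Horner's rule keeps it to one multiply and one add per term; on hardware with FMA each step could become a single fused multiply-add.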
Take as many terms of the series as your precision requires. Note that the error you should look at is the relative one: for small z the absolute error may be 1e-6, and for large z it will be more than 1e-6 in absolute terms, but the relative error abs(E-E1)/abs(E) is still smaller than 1e-6, provided you take enough terms for that z (where E is the exact exponential and E1 is what you get from the approximation).
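To make the absolute vs. relative distinction concrete, here is a small scalar sketch. It uses double rather than single precision so that rounding noise does not blur the picture. The function names are hypothetical, and E (the exact value) has to come from a reference such as the library exp().

```c
/* Sum of the first `terms` Taylor terms of exp(z): 1 + z + z^2/2! + ... */
double taylor_exp(double z, int terms)
{
    double term = 1.0, sum = 1.0;
    for (int k = 1; k < terms; ++k) {
        term *= z / k;   /* z^k / k! built from the previous term */
        sum += term;
    }
    return sum;
}

/* E is the exact value, E1 the approximation. */
double abs_error(double E, double E1)
{
    double d = E - E1;
    return d < 0 ? -d : d;
}

double rel_error(double E, double E1)
{
    return abs_error(E, E1) / (E < 0 ? -E : E);
}
```

With 30 terms, for example, z = 0.5 and z = 10 both stay under 1e-6 in relative error, while the absolute error at z = 10 is well above 1e-6 simply because exp(10) itself is large.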
UPDATE: As @Peter Cordes mentioned in a comment, precision can be improved by handling the integer and fractional parts of the exponent separately: the integer part is applied by manipulating the exponent field of the binary float representation (which is based on 2^x, not e^x). The Taylor series then only has to minimize error over a small range.
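A minimal sketch of that idea, under my own assumptions: write exp(x) = 2^n * exp(r) with n = round(x / ln 2) and r = x - n*ln 2, so |r| <= ln(2)/2, run the short Taylor series on r, and build 2^n by writing n + 127 into the exponent field. The function names, the 8-term polynomial, the two-part split of ln 2, and the clamping range are illustrative choices, not something taken from the comment. The integer shift and add need AVX2; the AVX2_FN macro is a GCC/Clang convenience so it builds without -mavx2.

```c
#include <immintrin.h>

/* Assumption: GCC/Clang. Lets these functions use AVX2 without -mavx2. */
#if defined(__GNUC__) || defined(__clang__)
#define AVX2_FN __attribute__((target("avx2")))
#else
#define AVX2_FN
#endif

AVX2_FN __m256 exp256_reduced(__m256 x)
{
    /* Clamp so n + 127 stays a valid biased exponent; inputs outside
       [-87, 88] saturate instead of giving 0 / infinity. */
    x = _mm256_min_ps(_mm256_max_ps(x, _mm256_set1_ps(-87.0f)),
                      _mm256_set1_ps(88.0f));

    /* n = round(x * log2(e)), kept as a float for now. */
    __m256 n = _mm256_round_ps(
        _mm256_mul_ps(x, _mm256_set1_ps(1.44269504f)),
        _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);

    /* r = x - n*ln2, with ln2 split into a short high part and a small
       correction so the first subtraction loses no bits. */
    __m256 r = _mm256_sub_ps(x, _mm256_mul_ps(n, _mm256_set1_ps(0.693359375f)));
    r = _mm256_sub_ps(r, _mm256_mul_ps(n, _mm256_set1_ps(-2.12194440e-4f)));

    /* exp(r) ~ 1 + r + ... + r^7/7! by Horner's rule, |r| <= ~0.35. */
    static const float c[8] = {
        1.0f, 1.0f, 1.0f / 2, 1.0f / 6,
        1.0f / 24, 1.0f / 120, 1.0f / 720, 1.0f / 5040
    };
    __m256 p = _mm256_set1_ps(c[7]);
    for (int k = 6; k >= 0; --k)
        p = _mm256_add_ps(_mm256_mul_ps(p, r), _mm256_set1_ps(c[k]));

    /* 2^n: put the biased exponent n + 127 into bits 23..30 of a float. */
    __m256i e = _mm256_slli_epi32(
        _mm256_add_epi32(_mm256_cvtps_epi32(n), _mm256_set1_epi32(127)), 23);
    return _mm256_mul_ps(p, _mm256_castsi256_ps(e));
}

/* Hypothetical helper: apply the approximation to 8 floats in memory. */
AVX2_FN void exp8_reduced(const float *in, float *out)
{
    _mm256_storeu_ps(out, exp256_reduced(_mm256_loadu_ps(in)));
}
```

Because the polynomial only ever sees |r| <= ln(2)/2, the same 8 terms that are useless at x = 20 on their own are enough here across the whole clamped range.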