I'd like to approximate the e^x function.
Is it possible to do so using a multiple-spline based approach, i.e. between x1
This is not appropriate for custom FPGA, but worth mentioning.
http://www.machinedlearnings.com/2011/06/fast-approximate-logarithm-exponential.html
And the source code:
https://code.google.com/archive/p/fastapprox/downloads
The "faster" implementation involves only three steps (multiply, add, convert float to int) and a final cast back to float. In my experience, it is accurate to within about 2%, which may be enough if you don't care about the actual value but are using it in a log-likelihood maximization iteration.
Or you could just do pow(M_E, x) in C. (Some platforms don't have M_E defined; on those, you may have to manually specify the value of e, which is approximately 2.71828182845904523536028747135266249775724709369995.)
(As David points out in the comments, exp(x) would be more efficient than pow(M_E, x). Again, brain not turned on yet.)
Do you have a use case where the calculation of e^x is a proven bottleneck? If not, you should be coding for readability first; only try these sorts of optimizations if the obvious approach is too slow.
For hardware, I have an awesome solution for you IF you need it to be bit-level accurate. (Else just do an approximation like above.) The identity is exp(x) = cosh(x) + sinh(x), the hyperbolic sine and cosine. The catch is that the hyperbolic sine and cosine can be computed using the CORDIC technique, and best of all, they are among the FAST CORDIC functions, meaning they look almost like multiply instead of almost like divide!
Which means that for about the area of an array multiplier, you can compute the exponential to arbitrary precision in just 2 cycles!
Look up the CORDIC method - it's AMAZING for hardware implementation.
One other hardware approach is using a small table in conjunction with a formula others have mentioned: exp(x + y) = exp(x) * exp(y). You can break the number up into small bit fields, say 4 or 8 bits at a time, look up the exponential for each bit field, and multiply the partial results together. Probably only effective for narrow computations, but it's another approach.
How about a strategy like this that uses the formula
e^x = 2^(x/ln(2))
with 1/ln(2) precomputed as a constant?
I realize this is not a complete solution, but it does only require a single multiplication and reduces the remaining problem to approximating a fractional power of 2, which should be easier to implement in hardware.
Also, if your application is specialized enough, you could try to re-derive all of the numerical code that will run on your hardware to be in a base-e number system and implement your floating point hardware to work in base e as well. Then no conversion is needed at all.