The sigmoid function is defined as
I found that using the C built-in function exp() to calculate the value of f(x) is slow. Is th
This answer probably isn't relevant for most cases, but just wanted to throw out there that for CUDA computing I've found x/sqrt(1+x^2) to be the fastest function by far.
For example, done with single precision float intrinsics:
__device__ void fooCudaKernel(/* some arguments */) {
float foo, sigmoid;
// some code defining foo
sigmoid = __fmul_rz(rsqrtf(__fmaf_rz(foo,foo,1)),foo);
}