The sigmoid function is defined as

f(x) = 1 / (1 + exp(-x))

I found that using the C built-in function exp() to calculate the value of f(x) is slow. Is there a faster alternative?
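For reference, what I'm doing is essentially just the definition written out (a minimal sketch):

#include <math.h>

/* Straightforward sigmoid using the standard library exp(). */
double sigmoid(double x) {
    return 1.0 / (1.0 + exp(-x));
}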
This answer probably isn't relevant for most cases, but I just wanted to throw out there that for CUDA computing I've found x/sqrt(1+x^2) to be the fastest function by far.
For example, done with single precision float intrinsics:
__device__ void fooCudaKernel(/* some arguments */) {
    float foo, sigmoid;
    // some code defining foo
    // foo / sqrt(1 + foo*foo), built from fused multiply-add, reciprocal square root, and multiply intrinsics
    sigmoid = __fmul_rz(rsqrtf(__fmaf_rz(foo, foo, 1.0f)), foo);
}
You don't have to use the actual, exact sigmoid function in a neural network algorithm; you can replace it with an approximated version that has similar properties but is faster to compute.

For example, you can use the "fast sigmoid" function

f(x) = x / (1 + abs(x))
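A minimal C sketch of this approximation (the function name is just for illustration):

#include <math.h>

/* "Fast sigmoid": no exp() call, just an add, an abs and a divide.
   Note that it ranges over (-1, 1) rather than (0, 1). */
float fast_sigmoid(float x) {
    return x / (1.0f + fabsf(x));
}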
Using the first terms of the series expansion of exp(x) won't help much if the arguments to f(x) are not near zero, and you have the same problem with a series expansion of the sigmoid function if the arguments are "large".
An alternative is to use table lookup. That is, you precalculate the values of the sigmoid function for a given number of data points, and then do fast (linear) interpolation between them if you want.
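A sketch of the lookup-table idea in C (the table size, input range, and clamping behaviour are arbitrary choices here, not anything prescribed):

#include <math.h>

#define TABLE_SIZE 2048
#define X_MIN (-8.0f)
#define X_MAX (8.0f)

static float table[TABLE_SIZE];

/* Precompute sigmoid values on a uniform grid over [X_MIN, X_MAX]. */
void sigmoid_table_init(void) {
    for (int i = 0; i < TABLE_SIZE; ++i) {
        float x = X_MIN + (X_MAX - X_MIN) * i / (TABLE_SIZE - 1);
        table[i] = 1.0f / (1.0f + expf(-x));
    }
}

/* Linear interpolation between the two nearest table entries;
   inputs outside the table range are clamped to 0 or 1. */
float sigmoid_lut(float x) {
    if (x <= X_MIN) return 0.0f;
    if (x >= X_MAX) return 1.0f;
    float t = (x - X_MIN) / (X_MAX - X_MIN) * (TABLE_SIZE - 1);
    int i = (int)t;
    float frac = t - i;
    return table[i] + frac * (table[i + 1] - table[i]);
}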
People here are mostly concerned with how fast one function is relative to another, and create micro-benchmarks to see whether f1(x) runs 0.0001 ms faster than f2(x). The big problem is that this is mostly irrelevant, because what matters is how fast your network learns with your activation function while trying to minimize your cost function.

According to current theory, the rectifier function and softplus, compared to the sigmoid function or similar activation functions, allow for faster and more effective training of deep neural architectures on large and complex datasets.

So I suggest throwing away the micro-optimization and taking a look at which function allows faster learning (also looking at various other cost functions).
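For completeness, both alternatives are cheap to write down (a minimal sketch using the standard definitions, not anything specific to one library):

#include <math.h>

/* Rectifier (ReLU): max(0, x). */
float relu(float x) {
    return x > 0.0f ? x : 0.0f;
}

/* Softplus: log(1 + exp(x)), a smooth approximation of the rectifier. */
float softplus(float x) {
    return log1pf(expf(x));
}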
You can also use this:

y = x / (2 * (x < 0.0 ? -x : x) + 2) + 0.5;
y' = y * (1 - y);

This is the fast sigmoid rescaled into the range (0, 1), and it acts like a sigmoid in that you can keep using y(1 - y) as y'; that curve is, let's say, rounder than the exact derivative 1/(2 * (1 + abs(x))^2), which behaves more like the fast sigmoid itself.
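A minimal C sketch of that pair (function names are just for illustration):

#include <math.h>

/* Fast sigmoid rescaled into (0, 1): x / (2*|x| + 2) + 0.5 */
float fast_sigmoid01(float x) {
    return x / (2.0f * fabsf(x) + 2.0f) + 0.5f;
}

/* Derivative approximation reusing the true sigmoid's form y' = y * (1 - y). */
float fast_sigmoid01_deriv(float y) {
    return y * (1.0f - y);
}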
I don't think you can do better than the built-in exp(), but if you want another approach, you can use a series expansion. WolframAlpha can compute it for you.
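For example, the first few terms of the Taylor series of the sigmoid around 0 give a small polynomial, though (as noted in another answer) it only pays off when |x| is small; a minimal sketch:

/* Taylor expansion of 1/(1 + exp(-x)) around 0:
   1/2 + x/4 - x^3/48 + x^5/480 - ...
   Only reasonable for small |x|. */
float sigmoid_series(float x) {
    float x2 = x * x;
    return 0.5f + x * (0.25f - x2 * (1.0f / 48.0f - x2 * (1.0f / 480.0f)));
}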
The tanh function may be optimized in some languages, making it faster than a custom-defined x/(1+abs(x)); such is the case in Julia.
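Since sigmoid(x) = 0.5 * (tanh(x/2) + 1), an optimized tanh can be reused directly; in C that would look like the sketch below (whether it actually beats exp() depends on your libm):

#include <math.h>

/* Sigmoid expressed through tanh: 1/(1 + exp(-x)) == 0.5 * (tanh(x/2) + 1). */
float sigmoid_via_tanh(float x) {
    return 0.5f * (tanhf(0.5f * x) + 1.0f);
}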