A feed-forward neural network with linear activations and any number of hidden layers is equivalent to a linear neural network with no hidden layer. For example, let's consider the neural network in the figure with two hidden layers and no activation:
y = h2 * W3 + b3
= (h1 * W2 + b2) * W3 + b3
= h1 * W2 * W3 + b2 * W3 + b3
= (x * W1 + b1) * W2 * W3 + b2 * W3 + b3
= x * W1 * W2 * W3 + b1 * W2 * W3 + b2 * W3 + b3
= x * W' + b'
We can do the last step because a composition of several linear transformations can be replaced by a single transformation, and a combination of several bias terms is just a single bias. The outcome is the same even if we add a linear activation. The sketch below checks this collapse numerically.
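Here is a minimal NumPy sketch of that check; the layer sizes and random weights are arbitrary choices for illustration, not taken from the figure:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))                       # batch of 5 inputs, 4 features each
W1, b1 = rng.normal(size=(4, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 6)), rng.normal(size=6)
W3, b3 = rng.normal(size=(6, 3)), rng.normal(size=3)

# Forward pass through the two-hidden-layer network with no (identity) activation.
h1 = x @ W1 + b1
h2 = h1 @ W2 + b2
y = h2 @ W3 + b3

# Collapsed single-layer equivalent: W' = W1 W2 W3 and b' = b1 W2 W3 + b2 W3 + b3.
W_prime = W1 @ W2 @ W3
b_prime = b1 @ W2 @ W3 + b2 @ W3 + b3

assert np.allclose(y, x @ W_prime + b_prime)      # the deep linear net equals one linear layer
```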
So we could replace this neural net with a single-layer neural net. The same argument extends to n layers, which indicates that adding layers doesn't increase the approximation power of a linear neural net at all. We need non-linear activation functions to approximate non-linear functions, and most real-world problems are highly complex and non-linear. In fact, when the activation function is non-linear, a two-layer neural network with a sufficiently large number of hidden units can be proven to be a universal function approximator.
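As a complementary sketch (again with arbitrary shapes and random weights), inserting a ReLU between the layers breaks the collapse: the network is no longer a linear function of its input, which is what gives depth its extra expressive power.

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 3)), rng.normal(size=3)

# Two-layer network with a ReLU hidden layer.
f = lambda v: relu(v @ W1 + b1) @ W2 + b2

# Any affine map g satisfies g(a + b) = g(a) + g(b) - g(0); the ReLU network
# violates this for random inputs (with high probability), so it is non-linear.
a, b = rng.normal(size=4), rng.normal(size=4)
print(np.allclose(f(a + b), f(a) + f(b) - f(np.zeros(4))))   # prints False
```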