Multi-layer perceptron (MLP) architecture: criteria for choosing the number of hidden layers and the size of each hidden layer?


盖世英雄少女心 2020-11-28 17:18

If we have 10 eigenvectors, then we can have 10 nodes in the input layer. If we have 5 output classes, then we can have 5 nodes in the output layer. But what are the criteria for

4 answers

一整个雨季 (OP)

2020-11-28 17:55

There is recent theoretical work on this: https://arxiv.org/abs/1809.09953. Assume you use a ReLU MLP in which all hidden layers have the same number of nodes, and that your loss function and the true function you are approximating satisfy some technical conditions (stated in the paper). Then you can choose the depth to be of order $\log(n)$ and the width of the hidden layers to be of order $n^{d/(2(\beta+d))}\log^2(n)$. Here $n$ is your sample size, $d$ is the dimension of your input vector, and $\beta$ is a smoothness parameter for the true function. Since $\beta$ is unknown, you will probably want to treat it as a hyperparameter.

With these choices, the approximation error converges to $0$ as the sample size grows, with probability converging to $1$; the paper gives the rate. Note that this isn't guaranteed to be the 'best' architecture, but it at least gives you a reasonable starting point. Further, my own experience suggests that things like dropout can still help in practice.
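As a rough illustration of how those rates turn into concrete numbers, here is a minimal Python sketch. The function names `suggest_mlp_shape` and `layer_sizes` and the constants `c_depth` and `c_width` are my own assumptions, not something taken from the paper: the rates only fix how depth and width scale with $n$, not the multiplicative constants, so treat the output as a starting point for tuning rather than a prescription.

```python
import math


def suggest_mlp_shape(n, d, beta, c_depth=1.0, c_width=1.0):
    """Return (depth, width) following the order-of-magnitude rates above.

    depth ~ c_depth * log(n)
    width ~ c_width * n^(d / (2 * (beta + d))) * log(n)^2

    c_depth and c_width are hypothetical constants; the rates do not
    determine them, so they are left as tunable parameters here.
    """
    if n < 2 or d < 1 or beta <= 0:
        raise ValueError("need n >= 2, d >= 1 and beta > 0")
    log_n = math.log(n)  # natural log; a different base only rescales the constants
    depth = max(1, math.ceil(c_depth * log_n))
    width = max(1, math.ceil(c_width * n ** (d / (2 * (beta + d))) * log_n ** 2))
    return depth, width


def layer_sizes(n_inputs, n_outputs, depth, width):
    """List of layer widths: input, `depth` equal-width hidden layers, output."""
    return [n_inputs] + [width] * depth + [n_outputs]


# Example with the question's 10 inputs and 5 output classes.
# n and beta are made-up values; beta is treated as a hyperparameter.
depth, width = suggest_mlp_shape(n=10_000, d=10, beta=2.0)
print(layer_sizes(10, 5, depth, width))
```

Because $\beta$ is unknown, one option is to call this for a few values of `beta` and compare the resulting architectures on validation data. A larger `beta` (a smoother target function) gives a narrower network for the same $n$ and $d$.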
