How to calculate the number of parameters of an LSTM network?

情歌与酒 2020-12-04 20:37

Is there a way to calculate the total number of parameters in an LSTM network?

I have found an example, but I'm unsure how correct it is or if I have understood i

6 answers
  •  执笔经年
    2020-12-04 21:06

    LSTM Equations (via deeplearning.ai Coursera)

  • It is evident from the equations that the outputs of all 6 equations have the same dimension, and that this dimension must equal the dimension of a(t).

  • Of these 6 equations, only 4 contribute parameters, and all 4 have the same form. So once we know the number of parameters for one equation, we can multiply it by 4 to get the total.

  • One important point: the total number of parameters does not depend on the number of time-steps (or input_length), because the same "W" and "b" are shared across all time-steps.

  • Assume that each gate inside the LSTM cell consists of a single layer (as in Keras).

  • Take equation 1 (the forget gate) as an example. Let n be the number of neurons in the layer and m the dimension of x (not counting the number of examples or time-steps). The forget gate therefore also has dimension n. As in an ordinary ANN layer, "Wf" has dimension n*(n+m) and "bf" has dimension n, so one equation has [{n*(n+m)} + n] parameters. The total is therefore 4*[{n*(n+m)} + n]. Expanding the brackets gives 4*(nm + n^2 + n).

  • Plugging your values into the formula (n=256, m=4096), the total number of parameters is 4*((256*256) + (256*4096) + 256) = 4*(1114368) = 4457472.
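
The formula above can be checked with a small Python helper. This is only a sketch: the function name `lstm_param_count` and its argument names are made up for illustration and are not part of Keras or any other library. It assumes a single LSTM layer with one weight matrix and one bias vector per gate, as described in the answer.

```python
def lstm_param_count(n_units: int, input_dim: int) -> int:
    """Count the parameters of one LSTM layer (hypothetical helper).

    n_units   -- n, the number of neurons in the layer
    input_dim -- m, the dimension of x (no batch or time-step axis)
    """
    # One gate equation: W has shape n x (n + m), b has shape n.
    per_gate = n_units * (n_units + input_dim) + n_units
    # Four equations carry parameters, all of the same shape.
    return 4 * per_gate


# Values from the question: n = 256, m = 4096.
print(lstm_param_count(256, 4096))  # 4457472
```

The number of time-steps is not an argument at all, which reflects the point that "W" and "b" are shared across time-steps.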
