What is num_units in tensorflow BasicLSTMCell?

北恋 2020-12-12 10:13

In the MNIST LSTM examples, I don't understand what "hidden layer" means. Is it the imaginary layer formed when you represent an unrolled RNN over time?

Why is num_units = 128 in most of the cases?

11 Answers
  • 2020-12-12 10:46

    Most LSTM/RNN diagrams just show the hidden cells but never the units inside those cells; hence the confusion. Each hidden layer has as many hidden cells as there are time steps, and each hidden cell is in turn made up of multiple hidden units, as in the diagram below. Therefore, the dimensionality of a hidden-layer matrix in an RNN is (number of time steps, number of hidden units).
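The shape claim above can be checked with a minimal sketch in plain NumPy (not TensorFlow); all names and sizes here are illustrative. Unrolling a vanilla RNN over time and stacking the per-step hidden states yields a matrix of shape (number of time steps, number of hidden units):

```python
import numpy as np

num_time_steps = 5
num_hidden_units = 3  # what TensorFlow calls num_units
input_dim = 4

rng = np.random.default_rng(0)
W_xh = rng.standard_normal((input_dim, num_hidden_units))   # input-to-hidden weights
W_hh = rng.standard_normal((num_hidden_units, num_hidden_units))  # hidden-to-hidden weights

inputs = rng.standard_normal((num_time_steps, input_dim))
h = np.zeros(num_hidden_units)  # one hidden cell per time step, num_hidden_units wide
hidden_layer = []
for x_t in inputs:  # unroll over time
    h = np.tanh(x_t @ W_xh + h @ W_hh)
    hidden_layer.append(h)
hidden_layer = np.stack(hidden_layer)

print(hidden_layer.shape)  # (5, 3) == (num_time_steps, num_hidden_units)
```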

  • 2020-12-12 10:48

    The concept of a hidden unit is illustrated in this image: https://imgur.com/Fjx4Zuo.

  • 2020-12-12 10:49

    This term num_units or num_hidden_units, sometimes noted with the variable name nhid in implementations, means that the hidden state of the LSTM cell is a vector of dimension nhid (or, for a batched implementation, a matrix of shape batch_size x nhid). As a result, the output (of the LSTM cell) is also of dimension nhid, since an RNN/LSTM/GRU cell emits its hidden state as its output. The input at each time step, by contrast, may have a different dimensionality.

    As pointed out earlier, this term was borrowed from the feed-forward neural network (FFN) literature and has caused confusion when used in the context of RNNs. But the idea is that even an RNN can be viewed as an FFN at each time step. In this view, the hidden layer would indeed contain num_hidden units, as depicted in this figure:

    Source: Understanding LSTM


    More concretely, in the example below, num_hidden_units (nhid) would be 3, since the hidden state (the middle layer) is a 3-dimensional vector.
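A rough NumPy sketch of a single LSTM cell step with nhid = 3 makes the point concrete; the weights are random placeholders, and the variable names are illustrative rather than TensorFlow's API. Note that the output has dimension nhid regardless of the input's dimension:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

nhid = 3        # num_hidden_units: size of the hidden state
input_dim = 4   # the input can have a different size

rng = np.random.default_rng(1)
# One combined weight matrix covering the four gate pre-activations
# (input, forget, candidate cell, output).
W = rng.standard_normal((input_dim + nhid, 4 * nhid))
b = np.zeros(4 * nhid)

x = rng.standard_normal(input_dim)  # input at one time step
h = np.zeros(nhid)                  # previous hidden state
c = np.zeros(nhid)                  # previous cell state

z = np.concatenate([x, h]) @ W + b
i, f, g, o = np.split(z, 4)
c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # gated cell-state update
h = sigmoid(o) * np.tanh(c)                   # output == new hidden state

print(h.shape)  # (3,): the output is nhid-dimensional, whatever input_dim is
```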

  • 2020-12-12 10:51

    An LSTM keeps two pieces of information as it propagates through time:

    A cell state, which is the memory the LSTM accumulates through time using its (forget, input, and output) gates, and a hidden state, which is the previous time step's output.

    TensorFlow's num_units is the size of the LSTM's hidden state (which is also the size of the output if no projection is used).

    To make the name num_units more intuitive, you can think of it as the number of hidden units in the LSTM cell, or the number of memory units in the cell.

    Look at this awesome post for more clarity.
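The two pieces of state described above can be sketched in plain NumPy; the gate math is condensed, and the sizes (num_units = 128, 28 rows of an MNIST image as time steps) are illustrative choices, not TensorFlow's defaults. Both states have size num_units, and the hidden state doubles as the output at each step:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

num_units = 128  # TensorFlow's num_units: size of hidden state and output
input_dim = 28   # e.g. one row of a 28x28 MNIST image per time step

rng = np.random.default_rng(2)
W = rng.standard_normal((input_dim + num_units, 4 * num_units)) * 0.01
b = np.zeros(4 * num_units)

h = np.zeros(num_units)  # hidden state (also the output)
c = np.zeros(num_units)  # cell state (the accumulated memory)

for _ in range(28):  # 28 time steps, one per image row
    x = rng.standard_normal(input_dim)
    i, f, g, o = np.split(np.concatenate([x, h]) @ W + b, 4)
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # memory updated via the gates
    h = sigmoid(o) * np.tanh(c)                   # output for this step

print(h.shape, c.shape)  # both (128,): num_units sets both sizes
```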

  • 2020-12-12 10:52

    I think this correctly answers your question. LSTMs always cause confusion.

    You can refer to this blog for more detail: Animated RNN, LSTM and GRU.
