When does Keras reset an LSTM state?

逝去的感伤 2020-11-28 04:23

I have read all sorts of texts about it, and none seems to answer this very basic question. It's always ambiguous:

In a stateful = False LSTM layer, does ker

5 answers
  •  囚心锁ツ
    2020-11-28 04:43

    Keras has two modes for maintaining state: 1) The default mode (stateful = False), where the state is reset after each batch. Within a batch, each sample carries its own state across its time steps; samples in the same batch are processed independently, so no state passes from one sample to another. So for your example the state would be reset 5 times in each epoch.

    2) The stateful mode, where Keras never resets the state on its own. It is up to the user to reset the state, for example before a new epoch. In this mode the state is propagated from sample "i" of one batch to sample "i" of the next batch. It is generally recommended to reset the state after each epoch, since a state carried over for too long may become unstable. However, in my experience with small datasets (20,000 to 40,000 samples), resetting or not resetting the state after an epoch does not make much of a difference to the end result. For bigger datasets it may make a difference.
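    A minimal sketch of the stateful behaviour, assuming `tf.keras` is available. The layer sizes and random input are made up for illustration, and method names such as `reset_states()` may differ between Keras versions, so treat this as a sketch rather than a reference:

```python
import numpy as np
import tensorflow as tf

# Illustrative sizes only (assumptions, not taken from the question).
BATCH, STEPS, FEATURES = 4, 10, 3

# A stateful layer needs a fixed batch size, so it is declared on the Input.
inputs = tf.keras.Input(shape=(STEPS, FEATURES), batch_size=BATCH)
lstm = tf.keras.layers.LSTM(8, stateful=True)
model = tf.keras.Model(inputs, lstm(inputs))

x = np.random.rand(BATCH, STEPS, FEATURES).astype("float32")

# First call starts from the initial (zero) state.
first = model.predict(x, batch_size=BATCH, verbose=0)
# Second call on the same data continues from the state left by the first call.
second = model.predict(x, batch_size=BATCH, verbose=0)

# Explicit reset by the user; a stateful layer is not reset automatically.
lstm.reset_states()
third = model.predict(x, batch_size=BATCH, verbose=0)

carried_over = not np.allclose(first, second)        # state was carried across calls
reset_worked = np.allclose(first, third, atol=1e-6)  # reset restored the initial state
print(carried_over, reset_worked)
```

    When training, the same reset would typically go at the epoch boundary, with shuffling turned off so that the order of batches stays meaningful.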

    A stateful model is useful if you have patterns that span hundreds of time steps. Otherwise the default mode is sufficient. In my experience, setting the batch size roughly equal to the length (in time steps) of the patterns in the data also helps.

    The stateful setup can be quite difficult to grasp at first. One would expect the state to be transferred from the last sample of one batch to the first sample of the next batch. But the state is actually propagated across batches between samples with the same index. The authors had these two choices and chose the latter. Read about this here. Also look at the relevant Keras FAQ section on stateful RNNs.
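    To make the same-index propagation concrete, here is a small pure-Python sketch of how long sequences could be cut into batches for a stateful layer. The helper name `stateful_batches` and the toy data are hypothetical and not part of Keras. The point is only the layout: row i of every batch holds the next chunk of the same sequence.

```python
def stateful_batches(sequences, steps):
    """Yield batches where row i of each batch continues row i of the previous batch.

    `sequences` is a list of equally long sequences (one per batch row);
    `steps` is the number of time steps per chunk. Hypothetical helper.
    """
    n_chunks = len(sequences[0]) // steps
    for k in range(n_chunks):
        # Batch k holds the k-th chunk of every sequence, in the same row order.
        yield [seq[k * steps:(k + 1) * steps] for seq in sequences]


# Two toy sequences, so the batch size is 2.
sequences = [list(range(0, 6)), list(range(100, 106))]
batches = list(stateful_batches(sequences, steps=3))

for batch in batches:
    print(batch)
# Row 0 goes [0, 1, 2] then [3, 4, 5]; row 1 goes [100, 101, 102] then [103, 104, 105].
```

    With this layout, the state left in row i after one batch is exactly the right starting state for row i of the next batch, which is why the batches must not be shuffled.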
