Initializing LSTM hidden state Tensorflow/Keras

Submitted by ぐ巨炮叔叔 on 2019-11-28 08:43:38

Yes - this is possible but truly cumbersome. Let's go through an example.

  1. Defining a model:

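    from keras.layers import LSTM, Input
    from keras.models import Model

    # A stateful LSTM needs a fixed batch size, so pass the full batch_shape:
    # (batch_size, timesteps, features).
    inputs = Input(batch_shape=(32, 10, 1))
    lstm_layer = LSTM(10, stateful=True)(inputs)

    model = Model(inputs, lstm_layer)
    model.compile(optimizer="adam", loss="mse")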
    

    It's important to build and compile the model first, because compilation resets the initial states. You also need to specify a batch_shape that includes the batch size, because in this scenario the network must be stateful (which is enabled by setting stateful=True).

  2. Now we can set the values of the initial states:

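    import numpy
    import keras.backend as K

    # Shapes are (batch_size, units), matching the model defined in step 1.
    hidden_states = K.variable(value=numpy.random.normal(size=(32, 10)))
    cell_states = K.variable(value=numpy.random.normal(size=(32, 10)))

    # states[0] is the hidden state, states[1] is the cell state.
    model.layers[1].states[0] = hidden_states
    model.layers[1].states[1] = cell_states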
    

    Note that you need to provide the states as Keras variables. states[0] holds the hidden states and states[1] holds the cell states.

Hope that helps.

Assuming the RNN is layer 1 and the hidden/cell states are NumPy arrays, you can do this:

    from keras import backend as K

    K.set_value(model.layers[1].states[0], hidden_states)
    K.set_value(model.layers[1].states[1], cell_states)

States can also be set using

    model.layers[1].states[0] = hidden_states
    model.layers[1].states[1] = cell_states

but when I did it this way, my state values stayed constant even after stepping the RNN.

I used this approach, and it worked for me:

    lstm_cell = LSTM(cell_num, return_state=True)

    output, h, c = lstm_cell(input, initial_state=[h_prev, c_prev])
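The snippet above leaves out where h_prev and c_prev come from. One way to fill that in, as a rough sketch only, is to make the initial states extra model inputs so they can be fed as NumPy arrays at call time. This assumes TensorFlow 2.x with its bundled Keras (the `tensorflow.keras` import path is my assumption, not the answer's); the names `seq_in`, `h_in`, `c_in` and the sizes are made up for illustration.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Illustrative sizes (assumptions, not from the answer above).
units, timesteps, features, batch = 10, 5, 1, 4

# Three inputs: the sequence plus the initial hidden and cell states.
seq_in = keras.Input(shape=(timesteps, features))
h_in = keras.Input(shape=(units,))
c_in = keras.Input(shape=(units,))

# return_state=True makes the layer return (output, last_h, last_c).
lstm = layers.LSTM(units, return_state=True)
output, h, c = lstm(seq_in, initial_state=[h_in, c_in])
model = keras.Model([seq_in, h_in, c_in], [output, h, c])

# Feed the sequence together with hand-picked initial states.
x = np.random.normal(size=(batch, timesteps, features)).astype("float32")
h0 = np.random.normal(size=(batch, units)).astype("float32")
c0 = np.random.normal(size=(batch, units)).astype("float32")
out, h_last, c_last = model.predict([x, h0, c0], verbose=0)

# For comparison: the same sequence starting from all-zero states.
zeros = np.zeros((batch, units), dtype="float32")
out_zero, _, _ = model.predict([x, zeros, zeros], verbose=0)
```

Because nothing is stored inside the layer, this variant does not need stateful=True or a fixed batch size. To carry state across chunks of a long sequence, you would pass h_last and c_last back in as the next call's initial states.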
