Variable Input for Sequence to Sequence Autoencoder


Question


I implemented a sequence-to-sequence encoder-decoder, but I am having problems with varying the target length during prediction. It works when the prediction sequence has the same length as the training sequences, but not when the length differs. What do I need to change?

from keras.models import Model
from keras.layers import Input, LSTM, Dense
import numpy as np

num_encoder_tokens = 2
num_decoder_tokens = 2
encoder_seq_length = None
decoder_seq_length = None
batch_size = 100
epochs = 2000
hidden_units=10
timesteps=10

input_seqs = np.random.random((1000, 10, num_encoder_tokens))
target_seqs = np.random.random((1000, 10, num_decoder_tokens))



#define training encoder
encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder = LSTM(hidden_units, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]
#define training decoder
decoder_inputs = Input(shape=(None,num_decoder_tokens))
decoder_lstm = LSTM(hidden_units, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_dense = Dense(num_decoder_tokens, activation='tanh')
decoder_outputs = decoder_dense(decoder_outputs)
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

#Run training
model.compile(optimizer='adam', loss='mse')
model.fit([input_seqs, target_seqs], target_seqs, batch_size=batch_size, epochs=epochs)

#new target data
target_seqs = np.random.random((2000, 10, num_decoder_tokens))


# define inference encoder
encoder_model = Model(encoder_inputs, encoder_states)
# define inference decoder
decoder_state_input_h = Input(shape=(hidden_units,))
decoder_state_input_c = Input(shape=(hidden_units,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)

# Initialize states from the trained encoder
states_values = encoder_model.predict(input_seqs)

Here the prediction expects the same batch size as input_seqs and does not accept target_seqs with a batch size of 2000:

target_seq = np.zeros((1, 1, num_decoder_tokens))
output = list()
for t in range(timesteps):
    output_tokens, h, c = decoder_model.predict([target_seqs] + states_values)
    output.append(output_tokens[0, 0, :])
    states_values = [h, c]
    target_seq = output_tokens

What do I need to change so that the model accepts a variable input length?


Answer 1:


Unfortunately, you cannot do that directly. You have to pad your input to the maximum expected length. Then you can skip the padded steps with masking, either through an Embedding layer or with a Masking layer that uses an explicit mask value:

keras.layers.Masking(mask_value=0.0)

See more information here.
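Below is a minimal sketch of that approach (my own illustration, not part of the original answer): sequences are padded with 0.0 up to an assumed max_seq_length of 20, and the Masking layer makes the LSTM ignore the padded timesteps.

from keras.models import Model
from keras.layers import Input, LSTM, Masking

max_seq_length = 20          # assumed maximum expected length
num_encoder_tokens = 2
hidden_units = 10

encoder_inputs = Input(shape=(max_seq_length, num_encoder_tokens))
# timesteps whose features all equal mask_value are skipped by the LSTM
masked = Masking(mask_value=0.0)(encoder_inputs)
encoder_outputs, state_h, state_c = LSTM(hidden_units, return_state=True)(masked)
encoder_model = Model(encoder_inputs, [state_h, state_c])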




Answer 2:


You can create a word/token in your data that means end_of_sequence.

You keep the length at a fixed maximum and probably use a Masking(mask_value) layer to avoid processing the undesired steps.

In both the inputs and outputs, you add the end_of_sequence token and complete the missing steps with mask_value.

Example:

  • the longest sequence has 4 steps
    • make it 5 to add an end_of_sequence token:
      • [step1, step2, step3, step4, end_of_sequence]
  • consider a sequence that is shorter:
    • [step1, step2, end_of_sequence, mask_value, mask_value]

Then your shape will be (batch, 5, features).
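As a rough illustration of that padding scheme (my own sketch, with arbitrary marker and mask values), a 2-step sequence would be extended to the common length of 5 like this:

import numpy as np

features = 2
end_of_sequence = np.full(features, -1.0)   # assumed special "end" vector
mask_value = 0.0                            # must match Masking(mask_value=...)

short_seq = np.random.random((2, features))               # only 2 real steps
padded = np.vstack([short_seq,
                    end_of_sequence[None, :],              # step 3: end marker
                    np.full((2, features), mask_value)])   # steps 4-5: padding
print(padded.shape)  # (5, 2); stacking many of these gives (batch, 5, features)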


Another approach is described in your other question, where each step is looped manually and the result of that step is checked against the end_of_sequence token: Difference between two Sequence to Sequence Models keras (with and without RepeatVector)

If this is an autoencoder, there is another possibility for variable lengths, where you take the length directly from the input (you must feed batches with only one sequence each, no padding/masking): How to apply LSTM-autoencoder to variant-length time-series data?

Yet another approach stores the input length explicitly in a reserved element of the latent vector and reads it back later (this must also run with only one sequence per batch, no padding): Variable length output in keras
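For the "one sequence per batch" approaches above, a minimal sketch (my own assumption, not taken from the linked answers) would leave the time dimension open and train on a single sequence at a time, so every batch can have its own length:

import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, Dense

features = 2
inp = Input(shape=(None, features))            # time dimension left variable
x = LSTM(10, return_sequences=True)(inp)
out = Dense(features, activation='tanh')(x)
autoencoder = Model(inp, out)
autoencoder.compile(optimizer='adam', loss='mse')

# illustrative toy sequences of different lengths
sequences = [np.random.random((np.random.randint(3, 8), features)) for _ in range(5)]
for seq in sequences:
    batch = seq[None, ...]                     # shape (1, length, features)
    autoencoder.train_on_batch(batch, batch)   # reconstruct the sequence itself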



Source: https://stackoverflow.com/questions/51501726/variable-input-for-sequence-to-sequence-autoencoder
