TensorFlow sequence-to-sequence model using the seq2seq API (ver 1.1 and above)


Decoding layer:

Decoding consists of two parts, because the decoder behaves differently during training and inference:

During inference, the decoder input at a particular time step is the decoder output from the previous time step. During training, however, the input is fixed to the actual target (the ground-truth target is fed back as the input, often called teacher forcing), which has been shown to improve performance.

Both cases are handled using classes and functions from tf.contrib.seq2seq.

  1. The main function for the decoder is seq2seq.dynamic_decode(), which performs dynamic decoding:

    tf.contrib.seq2seq.dynamic_decode(decoder, maximum_iterations)

    This takes a Decoder instance and maximum_iterations (the maximum target sequence length) as inputs.
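
    As a minimal usage sketch (training_decoder and max_target_len are hypothetical names for illustration, not part of the API):

    # Unrolls the decoder until every sequence finishes or
    # maximum_iterations is reached; returns the decoder outputs,
    # the final state and the decoded sequence lengths.
    outputs, final_state, final_seq_lens = tf.contrib.seq2seq.dynamic_decode(
        training_decoder,
        impute_finished=True,  # zero out outputs past the end of each sequence
        maximum_iterations=max_target_len)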

    1.1 The Decoder instance is from:

    seq2seq.BasicDecoder(cell, helper, initial_state, output_layer)

    The inputs are: cell (an RNNCell instance), helper (a Helper instance), initial_state (the initial state of the decoder, which should be the final state of the encoder) and output_layer (an optional dense layer applied to the outputs to make predictions).
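
    For instance, a hedged sketch (dec_cell, helper, enc_state and vocab_size are assumed to come from the surrounding steps):

    from tensorflow.python.layers.core import Dense

    # Optional projection from cell outputs to vocabulary-sized logits.
    output_layer = Dense(vocab_size)
    decoder = tf.contrib.seq2seq.BasicDecoder(
        cell=dec_cell, helper=helper,
        initial_state=enc_state, output_layer=output_layer)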

    1.2 The RNNCell instance can be, for example, an rnn.MultiRNNCell().
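
    For example, a sketch of a two-layer LSTM decoder cell (num_units is an assumed hyperparameter):

    # Stack two LSTM cells; if the encoder state is reused as
    # initial_state, the encoder must have a matching structure.
    dec_cell = tf.contrib.rnn.MultiRNNCell(
        [tf.contrib.rnn.LSTMCell(num_units) for _ in range(2)])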

    1.3 The Helper instance is the part that differs between training and inference. During training, we want the ground-truth inputs to be fed to the decoder, while during inference, we want the decoder output at time step (t) to be passed as the decoder input at time step (t+1).

    For training: we use seq2seq.TrainingHelper(inputs, sequence_length), which simply reads the given inputs.
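
    A sketch, assuming dec_embed_input is the embedded (and start-token-prefixed) target batch and target_lens holds the per-example target lengths:

    # Reads one time step of the ground-truth targets at each step.
    train_helper = tf.contrib.seq2seq.TrainingHelper(
        inputs=dec_embed_input, sequence_length=target_lens)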

    For inference: we use seq2seq.GreedyEmbeddingHelper() or seq2seq.SampleEmbeddingHelper(); they differ in whether they take the argmax() of the outputs or sample from the output distribution, and both pass the result through an embedding layer to get the next input.
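
    A sketch of the greedy variant, assuming embeddings is the embedding matrix and go_id/eos_id are the start/end token ids:

    # Greedy decoding: embed the argmax of the previous output and
    # feed it back as the next input, until end_token is produced.
    start_tokens = tf.fill([batch_size], go_id)
    infer_helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
        embedding=embeddings, start_tokens=start_tokens, end_token=eos_id)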

Putting it together: the Seq2Seq model

  1. Get the encoder state from the encoder layer and pass it as the initial_state to the decoder.
  2. Get the training and inference decoder outputs using seq2seq.dynamic_decode(). When calling both, make sure the weights are shared (use variable_scope to reuse the weights); see the sketch after this list.
  3. Then train the network using the loss function seq2seq.sequence_loss().
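
The following is a hedged end-to-end sketch of this wiring (tensor and hyperparameter names such as enc_embed_input, source_lens, targets, vocab_size and max_target_len are assumptions for illustration, not from the original post):

    import tensorflow as tf
    from tensorflow.python.layers.core import Dense

    # --- Encoder: run the embedded source batch through an LSTM. ---
    enc_cell = tf.contrib.rnn.LSTMCell(num_units)
    enc_outputs, enc_state = tf.nn.dynamic_rnn(
        enc_cell, enc_embed_input, sequence_length=source_lens, dtype=tf.float32)

    # Decoder cell and output projection, shared by both branches.
    dec_cell = tf.contrib.rnn.LSTMCell(num_units)
    output_layer = Dense(vocab_size)

    # --- Training branch: feed the ground-truth targets (teacher forcing). ---
    with tf.variable_scope("decode"):
        train_helper = tf.contrib.seq2seq.TrainingHelper(
            dec_embed_input, target_lens)
        train_decoder = tf.contrib.seq2seq.BasicDecoder(
            dec_cell, train_helper, enc_state, output_layer)
        train_outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(
            train_decoder, maximum_iterations=max_target_len)

    # --- Inference branch: same weights (reuse=True), feed back argmax. ---
    with tf.variable_scope("decode", reuse=True):
        start_tokens = tf.fill([batch_size], go_id)
        infer_helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
            embeddings, start_tokens, eos_id)
        infer_decoder = tf.contrib.seq2seq.BasicDecoder(
            dec_cell, infer_helper, enc_state, output_layer)
        infer_outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(
            infer_decoder, maximum_iterations=max_target_len)

    # --- Loss: masked cross-entropy over the training logits;
    #     targets is assumed padded to max_target_len. ---
    masks = tf.sequence_mask(target_lens, max_target_len, dtype=tf.float32)
    loss = tf.contrib.seq2seq.sequence_loss(
        train_outputs.rnn_output, targets, weights=masks)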

Example code is given here and here.
