Predicting the next word using the LSTM ptb model tensorflow example

我在风中等你 2020-12-09 20:06

I am trying to use the TensorFlow LSTM model to make next-word predictions.

As described in this related question (which has no accepted answer) the example contains

2 answers

长情又很酷 (original poster)

2020-12-09 21:05

I am implementing a seq2seq model too.

So let me try to explain, based on my understanding:

The output of your LSTM model is a list (of length num_steps) of 2D tensors, each of shape [batch_size, size].

The code line:

output = tf.reshape(tf.concat(1, outputs), [-1, size])

produces a new output, a 2D tensor of shape [batch_size x num_steps, size].

In your case, batch_size = 1 and num_steps = 20, so the output shape is [20, size].
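
The shape bookkeeping for the concat and reshape step can be checked with a small NumPy sketch. NumPy stands in for TensorFlow here, and size = 4 and the random data are assumptions chosen only for illustration:

```python
import numpy as np

batch_size, num_steps, size = 1, 20, 4  # size = 4 is an arbitrary, assumed hidden size

rng = np.random.default_rng(0)
# Stand-in for the LSTM outputs: a list of num_steps arrays, each [batch_size, size].
outputs = [rng.standard_normal((batch_size, size)) for _ in range(num_steps)]

# Concatenate along axis 1 -> [batch_size, num_steps * size],
# then reshape so that each row holds one time step of one sequence.
concatenated = np.concatenate(outputs, axis=1)
output = concatenated.reshape(-1, size)

print(concatenated.shape)  # (1, 80)
print(output.shape)        # (20, 4)
```

With batch_size = 1, row t of output is simply the LSTM output at time step t.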

Code line:

logits = tf.nn.xw_plus_b(output, tf.get_variable("softmax_w", [size, vocab_size]), tf.get_variable("softmax_b", [vocab_size]))

is equivalent to output[batch_size x num_steps, size] x softmax_w[size, vocab_size] + softmax_b[vocab_size], which gives logits of shape [batch_size x num_steps, vocab_size].
In your case, logits has shape [20, vocab_size], and the probs tensor has the same shape as logits, [20, vocab_size].
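
As a rough illustration of the projection step, here is a NumPy sketch of the same arithmetic. It assumes that probs is the row-wise softmax of logits, and size = 4 and vocab_size = 10 are toy values, not the ones from the example:

```python
import numpy as np

num_rows, size, vocab_size = 20, 4, 10  # size and vocab_size are assumed toy values

rng = np.random.default_rng(0)
output = rng.standard_normal((num_rows, size))       # [batch_size * num_steps, size]
softmax_w = rng.standard_normal((size, vocab_size))  # [size, vocab_size]
softmax_b = np.zeros(vocab_size)                     # [vocab_size]

# Same arithmetic as xw_plus_b: a matrix product plus a broadcast bias.
logits = output @ softmax_w + softmax_b

# Row-wise softmax; subtracting the row maximum keeps exp() numerically stable.
shifted = logits - logits.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

print(logits.shape, probs.shape)  # (20, 10) (20, 10)
```

Each row of probs sums to 1 and is a distribution over the vocabulary for one position.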

Code line:

chosen_word = np.argmax(probs, 1)

outputs a chosen_word array of shape [20], where each value is the index of the word predicted to follow the corresponding input word.
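
A small NumPy sketch of the argmax step, using random stand-in probabilities (vocab_size = 10 and the name next_word_id are assumptions for illustration). Note that np.argmax over axis 1 returns a 1D array with one index per row:

```python
import numpy as np

num_steps, vocab_size = 20, 10  # vocab_size = 10 is an assumed toy value

rng = np.random.default_rng(0)
# Stand-in for probs: each row is a distribution over the vocabulary.
probs = rng.random((num_steps, vocab_size))
probs /= probs.sum(axis=1, keepdims=True)

chosen_word = np.argmax(probs, 1)  # most likely word index for each row

print(chosen_word.shape)  # (20,)

# Row t is the prediction for the word following input word t, so the last
# row is the prediction for the word that comes after the whole input.
next_word_id = int(chosen_word[-1])
```

To turn next_word_id into an actual word, you would look it up in the vocabulary mapping used to encode the input.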

Code line:

loss = seq2seq.sequence_loss_by_example([logits], [tf.reshape(self._targets, [-1])], [tf.ones([batch_size * num_steps])])

computes the softmax cross-entropy loss between the logits and the target word indices, with every position in the batch_size sequences given a weight of 1.
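
For intuition, per-position softmax cross-entropy with unit weights can be sketched in NumPy as below. This is not the TensorFlow implementation, only the arithmetic as I understand it; vocab_size = 10 and the random logits and targets are assumed toy data:

```python
import numpy as np

num_rows, vocab_size = 20, 10  # batch_size * num_steps = 20; vocab_size is assumed

rng = np.random.default_rng(0)
logits = rng.standard_normal((num_rows, vocab_size))
targets = rng.integers(0, vocab_size, size=num_rows)  # flattened target word indices
weights = np.ones(num_rows)                           # every position counts equally

# Log-softmax via the log-sum-exp trick for numerical stability.
shifted = logits - logits.max(axis=1, keepdims=True)
log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

# Cross-entropy per position: negative log-probability of the target word.
loss = -log_probs[np.arange(num_rows), targets] * weights

print(loss.shape)  # (20,)
```

Summing loss and dividing by batch_size would then give a single scalar cost per sequence.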
