recurrent-neural-network

How to calculate perplexity of RNN in TensorFlow

放肆的年华 submitted on 2019-11-27 13:29:33
Question: I'm running the word RNN implementation of TensorFlow (Word RNN). How do I calculate the perplexity of the RNN? Following is the training code that shows the training loss and other things in each epoch:

    for e in range(model.epoch_pointer.eval(), args.num_epochs):
        sess.run(tf.assign(model.lr, args.learning_rate * (args.decay_rate ** e)))
        data_loader.reset_batch_pointer()
        state = sess.run(model.initial_state)
        speed = 0
        if args.init_from is None:
            assign_op = model.batch_pointer.assign(0)
            sess.run
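A common way to get the perplexity is to exponentiate the average per-word cross-entropy loss. Assuming the trainer's loss (e.g. model.cost here) already is that average negative log-likelihood in nats, a minimal sketch:

    import numpy as np

    def perplexity_from_loss(mean_cross_entropy_loss):
        # Perplexity = exp(average negative log-likelihood per word), loss in nats.
        return np.exp(mean_cross_entropy_loss)

    # Hypothetical usage inside the training loop above:
    # train_loss, state, _ = sess.run([model.cost, model.final_state, model.train_op], feed)
    # print("perplexity: {:.3f}".format(perplexity_from_loss(train_loss)))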

Get the last output of a dynamic_rnn in TensorFlow

丶灬走出姿态 submitted on 2019-11-27 12:28:42
I have a 3-D tensor of shape [batch, None, dim] where the second dimension, i.e. the timesteps, is unknown. I use dynamic_rnn to process such input, like in the following snippet:

    import numpy as np
    import tensorflow as tf

    batch = 2
    dim = 3
    hidden = 4

    lengths = tf.placeholder(dtype=tf.int32, shape=[batch])
    inputs = tf.placeholder(dtype=tf.float32, shape=[batch, None, dim])
    cell = tf.nn.rnn_cell.GRUCell(hidden)
    cell_state = cell.zero_state(batch, tf.float32)
    output, _ = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)

Actually, running this snippet with some actual numbers, I
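One way to recover the last valid output for each sequence (rather than the zero rows emitted after a sequence ends) is to gather output[i, lengths[i] - 1, :] for every example; a minimal sketch reusing the names from the snippet above:

    # Pick the output at the last valid timestep of each example.
    indices = tf.stack([tf.range(batch), lengths - 1], axis=1)  # shape [batch, 2]
    last_outputs = tf.gather_nd(output, indices)                # shape [batch, hidden]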

LSTM module for Caffe

允我心安 submitted on 2019-11-27 04:32:12
Does anyone know if there exists a nice LSTM module for Caffe? I found one from a github account by russel91, but apparently the webpage containing examples and explanations disappeared (formerly http://apollo.deepmatter.io/ --> it now redirects only to the github page, which has no examples or explanations anymore). Answer 1: I know Jeff Donahue worked on LSTM models using Caffe. He also gave a nice tutorial during CVPR 2015. He has a pull request with RNN and LSTM. Update: there is a new PR by Jeff Donahue including RNN and LSTM. This PR was merged into master in June 2016. Shai: In fact, training

How to deal with batches with variable-length sequences in TensorFlow?

做~自己de王妃 submitted on 2019-11-27 00:02:22
Question: I was trying to use an RNN (specifically, LSTM) for sequence prediction. However, I ran into an issue with variable sequence lengths. For example:

    sent_1 = "I am flying to Dubai"
    sent_2 = "I was traveling from US to Dubai"

I am trying to predict the next word after the current one with a simple RNN based on this benchmark for building a PTB LSTM model. However, the num_steps parameter (used for unrolling to the previous hidden states) should remain the same in each TensorFlow epoch.
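A common way to handle the mismatch is to pad each batch to its longest sentence and pass the true lengths to dynamic_rnn through sequence_length, so the recurrence stops updating the state past each sequence's end; a minimal TF 1.x sketch with hypothetical sizes and names:

    import tensorflow as tf

    max_len = 10          # pad every sentence in the batch to this length
    vocab_size = 10000    # hypothetical vocabulary size
    embed_dim = 64
    hidden = 128

    tokens = tf.placeholder(tf.int32, shape=[None, max_len])    # padded word ids
    seq_len = tf.placeholder(tf.int32, shape=[None])             # true sentence lengths

    embeddings = tf.get_variable("embeddings", [vocab_size, embed_dim])
    embedded = tf.nn.embedding_lookup(embeddings, tokens)

    cell = tf.nn.rnn_cell.BasicLSTMCell(hidden)
    outputs, state = tf.nn.dynamic_rnn(cell, embedded,
                                       sequence_length=seq_len,
                                       dtype=tf.float32)
    # Timesteps beyond seq_len[i] yield zero outputs and a frozen state,
    # so the padding does not corrupt the final state.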

Why do we "pack" the sequences in PyTorch?

怎甘沉沦 submitted on 2019-11-27 00:01:45
Question: I was trying to replicate "How to use packing for variable-length sequence inputs for rnn", but I guess I first need to understand why we need to "pack" the sequence. I understand why we need to "pad" them, but why is "packing" (through pack_padded_sequence) necessary? Any high-level explanation would be appreciated! Answer 1: I have stumbled upon this problem too and below is what I figured out. When training an RNN (LSTM or GRU or vanilla RNN), it is difficult to batch the variable-length sequences.
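To make the mechanics concrete, here is a minimal PyTorch sketch (with made-up sizes) that pads a batch and packs it before feeding an LSTM, so the recurrent kernel never computes the padded positions (enforce_sorted=False needs PyTorch 1.1+):

    import torch
    from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

    # Three variable-length sequences of 5-dimensional features.
    seqs = [torch.randn(4, 5), torch.randn(2, 5), torch.randn(3, 5)]
    lengths = torch.tensor([4, 2, 3])

    padded = pad_sequence(seqs, batch_first=True)     # shape (3, 4, 5), zeros after each end
    packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)

    lstm = torch.nn.LSTM(input_size=5, hidden_size=8, batch_first=True)
    packed_out, (h, c) = lstm(packed)                  # padded steps are skipped entirely
    out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)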

TensorFlow: How to pass output from previous time-step as input to next time-step

淺唱寂寞╮ submitted on 2019-11-26 23:03:35
Question: This is a duplicate of the question "How can I feed last output y(t-1) as input for generating y(t) in tensorflow RNN?". I want to pass the output of the RNN at time-step T as the input at time-step T+1:

    input_RNN(T+1) = output_RNN(T)

As per the documentation, the tf.nn.rnn as well as tf.nn.dynamic_rnn functions explicitly take the complete input for all time-steps. I checked the seq2seq example at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/seq2seq.py. It uses a loop and
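One way to get this feedback behaviour without the seq2seq helpers is to unroll the cell manually in Python and project each output back into the input space before the next step; a minimal TF 1.x sketch with hypothetical sizes (Layer-style cells reuse their variables across the loop):

    import tensorflow as tf

    dim = 3          # input/output dimensionality (hypothetical)
    hidden = 4
    num_steps = 5

    cell = tf.nn.rnn_cell.GRUCell(hidden)
    proj = tf.layers.Dense(dim)                       # maps the hidden output back to input space

    first_input = tf.placeholder(tf.float32, [None, dim])
    state = cell.zero_state(tf.shape(first_input)[0], tf.float32)

    step_input = first_input
    outputs = []
    for _ in range(num_steps):
        cell_out, state = cell(step_input, state)
        step_input = proj(cell_out)                   # y(t) becomes the input at t+1
        outputs.append(step_input)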

Is RNN initial state reset for subsequent mini-batches?

元气小坏坏 submitted on 2019-11-26 19:06:27
Question: Could someone please clarify whether the initial state of the RNN in TF is reset for subsequent mini-batches, or whether the last state of the previous mini-batch is used, as mentioned in Ilya Sutskever et al., ICLR 2015? Answer 1: The tf.nn.dynamic_rnn() or tf.nn.rnn() operations allow you to specify the initial state of the RNN using the initial_state parameter. If you don't specify this parameter, the hidden states will be initialized to zero vectors at the beginning of each training batch. In TensorFlow,
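If you do want the state carried across mini-batches (truncated-BPTT style), a common pattern is to fetch the final state from one sess.run and feed it back as the initial state of the next; a minimal sketch with hypothetical tensor names, assuming a single-tensor state such as a GRU's (an LSTM state tuple may need to be fed component-wise):

    # initial_state, final_state, inputs, targets and train_op are assumed
    # to be tensors/ops already defined in the graph.
    state_value = sess.run(initial_state)             # start from the zero state
    for x_batch, y_batch in batches:
        feed = {inputs: x_batch, targets: y_batch, initial_state: state_value}
        state_value, _ = sess.run([final_state, train_op], feed_dict=feed)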

Error when checking model input: expected lstm_1_input to have 3 dimensions, but got array with shape (339732, 29)

对着背影说爱祢 submitted on 2019-11-26 18:16:23
Question: My input is simply a CSV file with 339732 rows and two columns: the first being 29 feature values, i.e. X; the second being a binary label value, i.e. Y. I am trying to train my data on a stacked LSTM model:

    data_dim = 29
    timesteps = 8
    num_classes = 2

    model = Sequential()
    model.add(LSTM(30, return_sequences=True,
                   input_shape=(timesteps, data_dim)))   # returns a sequence of vectors of dimension 30
    model.add(LSTM(30, return_sequences=True))           # returns a sequence of vectors of dimension 30
    model
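The error says the LSTM expects 3-D input of shape (samples, timesteps, features), but the data is a 2-D array of shape (339732, 29). One possible fix is to slice the rows into windows of timesteps consecutive rows before fitting; a minimal sketch with placeholder arrays standing in for the CSV data:

    import numpy as np

    timesteps = 8
    X = np.random.rand(339732, 29).astype(np.float32)    # placeholder for the 29 feature columns
    Y = np.random.randint(0, 2, size=(339732,))           # placeholder for the binary labels

    # Each sample becomes `timesteps` consecutive rows, labelled with the
    # label of the window's last row.
    n_windows = X.shape[0] - timesteps + 1
    X_seq = np.stack([X[i:i + timesteps] for i in range(n_windows)])  # (n_windows, 8, 29)
    Y_seq = Y[timesteps - 1:]                                          # (n_windows,)

    # model.fit(X_seq, Y_seq, ...)   # now matches input_shape=(timesteps, data_dim)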
