recurrent-neural-network

How to calculate perplexity of RNN in TensorFlow

放肆的年华 submitted on 2019-11-27 13:29:33
Question: I'm running the word RNN implementation of TensorFlow (Word RNN). How do I calculate the perplexity of the RNN? Following is the training code that shows the training loss and other things in each epoch:

    for e in range(model.epoch_pointer.eval(), args.num_epochs):
        sess.run(tf.assign(model.lr, args.learning_rate * (args.decay_rate ** e)))
        data_loader.reset_batch_pointer()
        state = sess.run(model.initial_state)
        speed = 0
        if args.init_from is None:
            assign_op = model.batch_pointer.assign(0)
            sess.run
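A common way to get the perplexity is to exponentiate the average per-word cross-entropy loss. Assuming the trainer's loss (e.g. model.cost here) already is that average negative log-likelihood in nats, a minimal sketch:

    import numpy as np

    def perplexity_from_loss(mean_cross_entropy_loss):
        # Perplexity = exp(average negative log-likelihood per word), loss in nats.
        return np.exp(mean_cross_entropy_loss)

    # Hypothetical usage inside the training loop above:
    # train_loss, state, _ = sess.run([model.cost, model.final_state, model.train_op], feed)
    # print("perplexity: {:.3f}".format(perplexity_from_loss(train_loss)))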

Get the last output of a dynamic_rnn in TensorFlow

丶灬走出姿态 submitted on 2019-11-27 12:28:42
I have a 3-D tensor of shape [batch, None, dim] where the second dimension, i.e. the timesteps, is unknown. I use dynamic_rnn to process such input, like in the following snippet:

    import numpy as np
    import tensorflow as tf

    batch = 2
    dim = 3
    hidden = 4

    lengths = tf.placeholder(dtype=tf.int32, shape=[batch])
    inputs = tf.placeholder(dtype=tf.float32, shape=[batch, None, dim])
    cell = tf.nn.rnn_cell.GRUCell(hidden)
    cell_state = cell.zero_state(batch, tf.float32)
    output, _ = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)

Actually, running this snippet with some actual numbers, I
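One way to recover the last valid output for each sequence (rather than the zero rows emitted after a sequence ends) is to gather output[i, lengths[i] - 1, :] for every example; a minimal sketch reusing the names from the snippet above:

    # Pick the output at the last valid timestep of each example.
    indices = tf.stack([tf.range(batch), lengths - 1], axis=1)  # shape [batch, 2]
    last_outputs = tf.gather_nd(output, indices)                # shape [batch, hidden]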

LSTM module for Caffe

允我心安 submitted on 2019-11-27 04:32:12
Does anyone know if there exists a nice LSTM module for Caffe? I found one from a github account by russel91, but apparently the webpage containing examples and explanations disappeared (formerly http://apollo.deepmatter.io/ --> it now redirects only to the github page, which has no examples or explanations anymore). Answer 1: I know Jeff Donahue worked on LSTM models using Caffe. He also gave a nice tutorial during CVPR 2015. He has a pull request with RNN and LSTM. Update: there is a new PR by Jeff Donahue including RNN and LSTM. This PR was merged into master in June 2016. Shai: In fact, training

How to deal with batches with variable-length sequences in TensorFlow?

做~自己de王妃 submitted on 2019-11-27 00:02:22
Question: I was trying to use an RNN (specifically, LSTM) for sequence prediction. However, I ran into an issue with variable sequence lengths. For example:

    sent_1 = "I am flying to Dubai"
    sent_2 = "I was traveling from US to Dubai"

I am trying to predict the next word after the current one with a simple RNN based on this benchmark for building a PTB LSTM model. However, the num_steps parameter (used for unrolling to the previous hidden states) should remain the same in each TensorFlow epoch.
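A common way to handle the mismatch is to pad each batch to its longest sentence and pass the true lengths to dynamic_rnn through sequence_length, so the recurrence stops updating the state past each sequence's end; a minimal TF 1.x sketch with hypothetical sizes and names:

    import tensorflow as tf

    max_len = 10          # pad every sentence in the batch to this length
    vocab_size = 10000    # hypothetical vocabulary size
    embed_dim = 64
    hidden = 128

    tokens = tf.placeholder(tf.int32, shape=[None, max_len])    # padded word ids
    seq_len = tf.placeholder(tf.int32, shape=[None])             # true sentence lengths

    embeddings = tf.get_variable("embeddings", [vocab_size, embed_dim])
    embedded = tf.nn.embedding_lookup(embeddings, tokens)

    cell = tf.nn.rnn_cell.BasicLSTMCell(hidden)
    outputs, state = tf.nn.dynamic_rnn(cell, embedded,
                                       sequence_length=seq_len,
                                       dtype=tf.float32)
    # Timesteps beyond seq_len[i] yield zero outputs and a frozen state,
    # so the padding does not corrupt the final state.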

Why do we "pack" the sequences in PyTorch?

怎甘沉沦 submitted on 2019-11-27 00:01:45
Question: I was trying to replicate "How to use packing for variable-length sequence inputs for rnn", but I guess I first need to understand why we need to "pack" the sequence. I understand why we need to "pad" them, but why is "packing" (through pack_padded_sequence) necessary? Any high-level explanation would be appreciated! Answer 1: I have stumbled upon this problem too and below is what I figured out. When training an RNN (LSTM or GRU or vanilla RNN), it is difficult to batch the variable-length sequences.
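To make the mechanics concrete, here is a minimal PyTorch sketch (with made-up sizes) that pads a batch and packs it before feeding an LSTM, so the recurrent kernel never computes the padded positions (enforce_sorted=False needs PyTorch 1.1+):

    import torch
    from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

    # Three variable-length sequences of 5-dimensional features.
    seqs = [torch.randn(4, 5), torch.randn(2, 5), torch.randn(3, 5)]
    lengths = torch.tensor([4, 2, 3])

    padded = pad_sequence(seqs, batch_first=True)     # shape (3, 4, 5), zeros after each end
    packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)

    lstm = torch.nn.LSTM(input_size=5, hidden_size=8, batch_first=True)
    packed_out, (h, c) = lstm(packed)                  # padded steps are skipped entirely
    out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)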

TensorFlow: How to pass output from previous time-step as input to next time-step

淺唱寂寞╮ submitted on 2019-11-26 23:03:35
Question: This is a duplicate of the question "How can I feed last output y(t-1) as input for generating y(t) in tensorflow RNN?". I want to pass the output of the RNN at time-step T as the input at time-step T+1:

    input_RNN(T+1) = output_RNN(T)

As per the documentation, the tf.nn.rnn as well as tf.nn.dynamic_rnn functions explicitly take the complete input for all time-steps. I checked the seq2seq example at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/seq2seq.py. It uses a loop and
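One way to get this feedback behaviour without the seq2seq helpers is to unroll the cell manually in Python and project each output back into the input space before the next step; a minimal TF 1.x sketch with hypothetical sizes (Layer-style cells reuse their variables across the loop):

    import tensorflow as tf

    dim = 3          # input/output dimensionality (hypothetical)
    hidden = 4
    num_steps = 5

    cell = tf.nn.rnn_cell.GRUCell(hidden)
    proj = tf.layers.Dense(dim)                       # maps the hidden output back to input space

    first_input = tf.placeholder(tf.float32, [None, dim])
    state = cell.zero_state(tf.shape(first_input)[0], tf.float32)

    step_input = first_input
    outputs = []
    for _ in range(num_steps):
        cell_out, state = cell(step_input, state)
        step_input = proj(cell_out)                   # y(t) becomes the input at t+1
        outputs.append(step_input)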

Is RNN initial state reset for subsequent mini-batches?

元气小坏坏 submitted on 2019-11-26 19:06:27
Question: Could someone please clarify whether the initial state of the RNN in TF is reset for subsequent mini-batches, or whether the last state of the previous mini-batch is used, as mentioned in Ilya Sutskever et al., ICLR 2015? Answer 1: The tf.nn.dynamic_rnn() or tf.nn.rnn() operations allow you to specify the initial state of the RNN using the initial_state parameter. If you don't specify this parameter, the hidden states will be initialized to zero vectors at the beginning of each training batch. In TensorFlow,
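If you do want the state carried across mini-batches (truncated-BPTT style), a common pattern is to fetch the final state from one sess.run and feed it back as the initial state of the next; a minimal sketch with hypothetical tensor names, assuming a single-tensor state such as a GRU's (an LSTM state tuple may need to be fed component-wise):

    # initial_state, final_state, inputs, targets and train_op are assumed
    # to be tensors/ops already defined in the graph.
    state_value = sess.run(initial_state)             # start from the zero state
    for x_batch, y_batch in batches:
        feed = {inputs: x_batch, targets: y_batch, initial_state: state_value}
        state_value, _ = sess.run([final_state, train_op], feed_dict=feed)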

Error when checking model input: expected lstm_1_input to have 3 dimensions, but got array with shape (339732, 29)

对着背影说爱祢 submitted on 2019-11-26 18:16:23
Question: My input is simply a CSV file with 339732 rows and two columns: the first being 29 feature values, i.e. X; the second being a binary label value, i.e. Y. I am trying to train my data on a stacked LSTM model:

    data_dim = 29
    timesteps = 8
    num_classes = 2

    model = Sequential()
    model.add(LSTM(30, return_sequences=True,
                   input_shape=(timesteps, data_dim)))   # returns a sequence of vectors of dimension 30
    model.add(LSTM(30, return_sequences=True))           # returns a sequence of vectors of dimension 30
    model
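The error says the LSTM expects 3-D input of shape (samples, timesteps, features), but the data is a 2-D array of shape (339732, 29). One possible fix is to slice the rows into windows of timesteps consecutive rows before fitting; a minimal sketch with placeholder arrays standing in for the CSV data:

    import numpy as np

    timesteps = 8
    X = np.random.rand(339732, 29).astype(np.float32)    # placeholder for the 29 feature columns
    Y = np.random.randint(0, 2, size=(339732,))           # placeholder for the binary labels

    # Each sample becomes `timesteps` consecutive rows, labelled with the
    # label of the window's last row.
    n_windows = X.shape[0] - timesteps + 1
    X_seq = np.stack([X[i:i + timesteps] for i in range(n_windows)])  # (n_windows, 8, 29)
    Y_seq = Y[timesteps - 1:]                                          # (n_windows,)

    # model.fit(X_seq, Y_seq, ...)   # now matches input_shape=(timesteps, data_dim)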
