recurrent-neural-network

Batch Normalization in tensorflow

帅比萌擦擦* Submitted on 2019-12-09 04:54:58
Question: I noticed there are already batch normalization functions in the TensorFlow API. One thing I don't understand, though, is how to change the procedure between training and test. Batch normalization acts differently during testing than during training; specifically, a fixed mean and variance are used during testing. Is there some good example code somewhere? I saw some, but with scope variables it got confusing. Answer 1: You are right, tf.nn.batch_normalization provides just the basic
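A common way to handle the train/test switch (in TF 1.x) is the higher-level tf.layers.batch_normalization driven by a boolean training flag; the following is a minimal sketch, with the input shape, dummy loss, and optimizer chosen only for illustration:

import tensorflow as tf

# Hypothetical input and a boolean flag distinguishing training from inference.
x = tf.placeholder(tf.float32, shape=[None, 64], name='x')
is_training = tf.placeholder(tf.bool, name='is_training')

# Uses batch statistics when training=True and the accumulated
# moving averages when training=False.
h = tf.layers.batch_normalization(x, training=is_training)

# The moving-average updates are side effects that must be run explicitly.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    # Dummy loss, just so the sketch has something to minimize.
    train_op = tf.train.AdamOptimizer(1e-3).minimize(tf.reduce_mean(tf.square(h)))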

Siamese Model with LSTM network fails to train using tensorflow

 ̄綄美尐妖づ Submitted on 2019-12-09 03:36:27
Dataset Description The dataset contains a set of question pairs and a label which tells if the questions are the same, e.g. "How do I read and find my YouTube comments?", "How can I see all my Youtube comments?", "1". The goal of the model is to identify whether a given question pair is the same or different. Approach I have created a Siamese network to identify whether two questions are the same. Following is the model:
graph = tf.Graph()
with graph.as_default():
    embedding_placeholder = tf.placeholder(tf.float32, shape=embedding_matrix.shape, name='embedding_placeholder')
    with tf.variable_scope('siamese
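For reference, a minimal sketch of the weight-sharing idea behind such a siamese encoder (TF 1.x; the sequence length, embedding size, and cell size below are hypothetical choices, not the asker's):

import tensorflow as tf

def encode(inputs, reuse):
    # Both questions pass through the same LSTM: reuse=True on the second
    # call makes TensorFlow return the variables created by the first call.
    with tf.variable_scope('siamese_encoder', reuse=reuse):
        cell = tf.nn.rnn_cell.LSTMCell(128)
        _, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)
        return state.h  # final hidden state as the sentence embedding

q1 = tf.placeholder(tf.float32, [None, 30, 300])  # embedded question 1
q2 = tf.placeholder(tf.float32, [None, 30, 300])  # embedded question 2

e1 = encode(q1, reuse=False)
e2 = encode(q2, reuse=True)

# Distance between the two embeddings, as used in contrastive-style losses.
distance = tf.sqrt(tf.reduce_sum(tf.square(e1 - e2), axis=1) + 1e-8)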

How to generate/read sparse sequence labels for CTC loss within Tensorflow?

北慕城南 Submitted on 2019-12-08 07:26:33
From a list of word images with their transcriptions, I am trying to create and read sparse sequence labels (for tf.nn.ctc_loss) using a tf.train.slice_input_producer, avoiding: serializing pre-packaged training data to disk in TFRecord format; the apparent limitations of tf.py_func; any unnecessary or premature padding; and reading the entire dataset into RAM. The main issue seems to be converting a string to the sequence of labels (a SparseTensor) needed for tf.nn.ctc_loss. For example, with the character set in the (ordered) range [A-Z], I'd want to convert the text label string "BAD" to
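As a minimal sketch of the target format (not the asker's full pipeline), this is what the labels SparseTensor for a batch containing only "BAD" would look like under the A=0..Z=25 mapping, plus one possible in-graph way to derive the ids from the string's raw bytes:

import tensorflow as tf

# With A-Z mapped to 0-25, "BAD" becomes [1, 0, 3].
# tf.nn.ctc_loss wants labels as a SparseTensor over [batch, time]:
indices = [[0, 0], [0, 1], [0, 2]]   # (example_index, character_position)
values = [1, 0, 3]                   # label ids for 'B', 'A', 'D'
dense_shape = [1, 3]                 # batch of 1, longest label has 3 chars
labels = tf.SparseTensor(indices=indices, values=values, dense_shape=dense_shape)

# In-graph, the same mapping can be derived from the raw bytes of the string:
text = tf.constant("BAD")
label_ids = tf.cast(tf.decode_raw(text, tf.uint8), tf.int32) - ord('A')  # [1, 0, 3]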

When to stop training neural networks?

自古美人都是妖i Submitted on 2019-12-06 08:12:43
Question: I'm trying to carry out domain-specific classification research using an RNN and have accumulated tens of millions of texts. Since it takes days or even months to run over the whole dataset, I only picked a small portion of it for testing, say 1M texts (80% for training, 20% for validation). I pre-trained word vectors on the whole corpus, and I also applied Dropout to the model to avoid over-fitting. After training on 60,000 texts within 12 hrs, the loss had already dropped to a fairly low
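A common stopping rule for this situation is patience-based early stopping on the validation loss. The following framework-agnostic sketch assumes hypothetical train_one_epoch, evaluate, and save_checkpoint callables supplied by the caller:

def train_with_early_stopping(train_one_epoch, evaluate, save_checkpoint,
                              max_epochs=100, patience=5):
    # Stop when validation loss hasn't improved for `patience` epochs.
    best_val, bad_epochs = float('inf'), 0
    for epoch in range(max_epochs):
        train_one_epoch()                 # hypothetical: one pass over training data
        val_loss = evaluate()             # hypothetical: loss on the held-out 20%
        if val_loss < best_val - 1e-4:    # meaningful improvement
            best_val, bad_epochs = val_loss, 0
            save_checkpoint()             # keep the best weights seen so far
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break                     # validation loss has stalled: stop
    return best_val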

Neural Machine Translation model predictions are off-by-one

别说谁变了你拦得住时间么 Submitted on 2019-12-06 07:19:57
Question: Problem Summary In the following example, my NMT model has high loss because it correctly predicts target_input instead of target_output. target_input: 1 3 3 3 3 6 6 6 9 7 7 7 4 4 4 4 4 9 9 10 10 10 3 3 10 10 3 10 3 3 10 10 3 9 9 4 4 4 4 4 3 10 3 3 9 9 3 6 6 6 6 6 6 10 9 9 10 10 4 4 4 4 4 4 4 4 4 4 4 4 9 9 9 9 3 3 3 6 6 6 6 6 9 9 10 3 4 4 4 4 4 4 4 4 4 4 4 4 9 9 10 3 10 9 9 3 4 4 4 4 4 4 4 4 4 10 10 4 4 4 4 4 4 4 4 4 4 9 9 10 3 6 6 6 6 3 3 3 10 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 9 9 3 3 10 6 6 6 6
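A frequent cause of this symptom is misaligned decoder targets. As an illustration (with hypothetical token ids, not the asker's data), the standard teacher-forcing shift looks like this:

# Standard teacher-forcing alignment: the decoder sees <sos> + tokens as
# input and is trained to emit tokens + <eos>, i.e. everything shifted by one.
SOS, EOS = 1, 2                      # hypothetical special token ids
tokens = [3, 3, 6, 9, 7]             # one target sentence
target_input = [SOS] + tokens        # [1, 3, 3, 6, 9, 7]
target_output = tokens + [EOS]       # [3, 3, 6, 9, 7, 2]
# If target_input is accidentally used as the training label, the model
# learns the identity mapping and its predictions look shifted by one.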

How to handle padding when using sequence_length parameter in TensorFlow dynamic_rnn

左心房为你撑大大i Submitted on 2019-12-06 06:38:12
Question: I'm trying to use the dynamic_rnn function in TensorFlow to speed up training. After doing some reading, my understanding is that one way to speed up training is to explicitly pass a value to the sequence_length parameter of this function. After a bit more reading, and finding this SO explanation, it seems like what I need to pass is a vector (maybe defined by a tf.placeholder) that contains the length of each sequence within a batch. Here's where I'm confused: in order to take advantage of
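One common way to supply sequence_length without feeding it by hand is to derive it from the padded batch itself. A minimal TF 1.x sketch, assuming the padding id is 0 and with hypothetical vocabulary/embedding/cell sizes:

import tensorflow as tf

# Zero-padded integer batch of shape (batch, max_time); pad id assumed to be 0.
inputs = tf.placeholder(tf.int32, [None, None], name='inputs')

# Length of each sequence = number of non-pad positions in its row.
seq_len = tf.reduce_sum(tf.cast(tf.not_equal(inputs, 0), tf.int32), axis=1)

embedding = tf.get_variable('embedding', [10000, 128])  # hypothetical sizes
embedded = tf.nn.embedding_lookup(embedding, inputs)

cell = tf.nn.rnn_cell.GRUCell(256)
# Past each sequence's true length, dynamic_rnn copies the state through
# unchanged and emits zeros instead of processing the padding steps.
outputs, state = tf.nn.dynamic_rnn(cell, embedded,
                                   sequence_length=seq_len, dtype=tf.float32)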

Predict exponentially weighted average using a simple RNN

≯℡__Kan透↙ Submitted on 2019-12-06 06:33:57
Question: In an attempt to further explore the keras-tf RNN capabilities and different parameters, I decided to solve a toy problem as described: build a source dataset composed of a sequence of random numbers, and build a "labels" dataset by applying the EWMA formula to the source dataset. The idea behind it is that EWMA has a very clear and simple definition of how it uses the "history" of the sequence: EWMA_t = (1 - alpha) * EWMA_{t-1} + alpha * x_t. My assumption is that when looking at a
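For concreteness, a small NumPy sketch of how such a source/label pair could be generated (alpha and the sequence length are arbitrary choices for the example):

import numpy as np

alpha, n = 0.3, 1000
x = np.random.rand(n)                 # source: a sequence of random numbers

# labels[t] = (1 - alpha) * labels[t-1] + alpha * x[t]
labels = np.empty(n)
labels[0] = x[0]
for t in range(1, n):
    labels[t] = (1 - alpha) * labels[t - 1] + alpha * x[t]

# Shaped for a Keras SimpleRNN: (samples, timesteps, features) = (1, n, 1)
X = x.reshape(1, n, 1)
y = labels.reshape(1, n, 1)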

Tensorflow RNN input size

大兔子大兔子 Submitted on 2019-12-06 06:25:39
I am trying to use TensorFlow to create a recurrent neural network. My code is something like this:
import tensorflow as tf
rnn_cell = tf.nn.rnn_cell.GRUCell(3)
inputs = [tf.constant([[0, 1]], dtype=tf.float32), tf.constant([[2, 3]], dtype=tf.float32)]
outputs, end = tf.nn.rnn(rnn_cell, inputs, dtype=tf.float32)
Now, everything runs just fine. However, I am rather confused by what is actually going on. The output dimensions are always batch size x the size of the RNN cell's hidden state; how can they be completely independent of the input size? If my understanding is correct, the inputs
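The short answer is that the input size is absorbed by the cell's internal weight matrices, so only the parameter shapes depend on it, never the outputs. A sketch that makes this visible (assuming TF 1.x, where tf.nn.rnn was renamed tf.nn.static_rnn):

import tensorflow as tf

rnn_cell = tf.nn.rnn_cell.GRUCell(3)
inputs = [tf.constant([[0, 1]], dtype=tf.float32),
          tf.constant([[2, 3]], dtype=tf.float32)]
outputs, end = tf.nn.static_rnn(rnn_cell, inputs, dtype=tf.float32)

# The input size only appears inside the cell's weights: each gate kernel
# has shape (input_size + num_units, ...), while every output is (batch, num_units).
for v in tf.trainable_variables():
    print(v.name, v.shape)   # gates/kernel: (2 + 3, 2 * 3), candidate/kernel: (2 + 3, 3)
print(outputs[0].shape)      # (1, 3): batch x num_units, independent of input size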

Format time-series data for short term forecasting using Recurrent Neural networks

穿精又带淫゛_ Submitted on 2019-12-06 05:37:14
I want to forecast day-ahead power consumption using recurrent neural networks (RNNs), but I find the required data format (samples, timesteps, features) for RNNs confusing. Let me explain with an example: I have power_dataset.csv on Dropbox, which contains power consumption from 5 June to 18 June at a 10-minute rate (144 observations per day). Now, to check the performance of an RNN using the rnn R package, I am following these steps: train model M for the usage of 17 June by using data from 5-16 June; predict usage of 18 June by using M and updated usage from 6-17 June. My understanding of RNN
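To make the (samples, timesteps, features) shape concrete, here is a NumPy illustration (Python rather than the rnn R package the question uses) of turning the 10-minute series into day-in/day-out windows; the layout of power_dataset.csv is assumed to be one reading per line:

import numpy as np

# 10-minute readings, 144 per day, covering 5-18 June.
power = np.loadtxt('power_dataset.csv')           # assumes one value per line

timesteps = 144                                    # one day of history per sample
X, y = [], []
for i in range(len(power) - 2 * timesteps + 1):
    X.append(power[i:i + timesteps])               # one day of usage ...
    y.append(power[i + timesteps:i + 2 * timesteps])  # ... predicts the next day

# (samples, timesteps, features): each 10-minute reading is one timestep
# carrying a single feature, the consumption value.
X = np.array(X).reshape(-1, timesteps, 1)
y = np.array(y)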