recurrent-neural-network

Batch Normalization in tensorflow

帅比萌擦擦* Submitted on 2019-12-09 04:54:58
Question: I noticed there are already batch normalization functions in the TensorFlow API. One thing I don't understand, though, is how to change the procedure between training and test. Batch normalization acts differently during testing than during training; specifically, a fixed mean and variance are used during testing. Is there some good example code somewhere? I saw some, but with scope variables it got confusing. Answer 1: You are right, tf.nn.batch_normalization provides just the basic
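A common way to handle the train/test switch (in TF 1.x) is the higher-level tf.layers.batch_normalization driven by a boolean training flag; the following is a minimal sketch, with the input shape, dummy loss, and optimizer chosen only for illustration:

import tensorflow as tf

# Hypothetical input and a boolean flag distinguishing training from inference.
x = tf.placeholder(tf.float32, shape=[None, 64], name='x')
is_training = tf.placeholder(tf.bool, name='is_training')

# Uses batch statistics when training=True and the accumulated
# moving averages when training=False.
h = tf.layers.batch_normalization(x, training=is_training)

# The moving-average updates are side effects that must be run explicitly.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    # Dummy loss, just so the sketch has something to minimize.
    train_op = tf.train.AdamOptimizer(1e-3).minimize(tf.reduce_mean(tf.square(h)))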

Siamese Model with LSTM network fails to train using tensorflow

 ̄綄美尐妖づ Submitted on 2019-12-09 03:36:27
Dataset Description The dataset contains a set of question pairs and a label which tells if the questions are the same, e.g. "How do I read and find my YouTube comments?", "How can I see all my Youtube comments?", "1". The goal of the model is to identify whether a given question pair is the same or different. Approach I have created a Siamese network to identify whether two questions are the same. Following is the model:
graph = tf.Graph()
with graph.as_default():
    embedding_placeholder = tf.placeholder(tf.float32, shape=embedding_matrix.shape, name='embedding_placeholder')
    with tf.variable_scope('siamese
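For reference, a minimal sketch of the weight-sharing idea behind such a siamese encoder (TF 1.x; the sequence length, embedding size, and cell size below are hypothetical choices, not the asker's):

import tensorflow as tf

def encode(inputs, reuse):
    # Both questions pass through the same LSTM: reuse=True on the second
    # call makes TensorFlow return the variables created by the first call.
    with tf.variable_scope('siamese_encoder', reuse=reuse):
        cell = tf.nn.rnn_cell.LSTMCell(128)
        _, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)
        return state.h  # final hidden state as the sentence embedding

q1 = tf.placeholder(tf.float32, [None, 30, 300])  # embedded question 1
q2 = tf.placeholder(tf.float32, [None, 30, 300])  # embedded question 2

e1 = encode(q1, reuse=False)
e2 = encode(q2, reuse=True)

# Distance between the two embeddings, as used in contrastive-style losses.
distance = tf.sqrt(tf.reduce_sum(tf.square(e1 - e2), axis=1) + 1e-8)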

How to generate/read sparse sequence labels for CTC loss within Tensorflow?

北慕城南 Submitted on 2019-12-08 07:26:33
From a list of word images with their transcriptions, I am trying to create and read sparse sequence labels (for tf.nn.ctc_loss) using a tf.train.slice_input_producer, avoiding: serializing pre-packaged training data to disk in TFRecord format; the apparent limitations of tf.py_func; any unnecessary or premature padding; and reading the entire dataset into RAM. The main issue seems to be converting a string to the sequence of labels (a SparseTensor) needed for tf.nn.ctc_loss. For example, with the character set in the (ordered) range [A-Z], I'd want to convert the text label string "BAD" to
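As a minimal sketch of the target format (not the asker's full pipeline), this is what the labels SparseTensor for a batch containing only "BAD" would look like under the A=0..Z=25 mapping, plus one possible in-graph way to derive the ids from the string's raw bytes:

import tensorflow as tf

# With A-Z mapped to 0-25, "BAD" becomes [1, 0, 3].
# tf.nn.ctc_loss wants labels as a SparseTensor over [batch, time]:
indices = [[0, 0], [0, 1], [0, 2]]   # (example_index, character_position)
values = [1, 0, 3]                   # label ids for 'B', 'A', 'D'
dense_shape = [1, 3]                 # batch of 1, longest label has 3 chars
labels = tf.SparseTensor(indices=indices, values=values, dense_shape=dense_shape)

# In-graph, the same mapping can be derived from the raw bytes of the string:
text = tf.constant("BAD")
label_ids = tf.cast(tf.decode_raw(text, tf.uint8), tf.int32) - ord('A')  # [1, 0, 3]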

When to stop training neural networks?

自古美人都是妖i Submitted on 2019-12-06 08:12:43
Question: I'm trying to carry out domain-specific classification research using an RNN and have accumulated tens of millions of texts. Since it takes days or even months to run over the whole dataset, I only picked a small portion of it for testing, say 1M texts (80% for training, 20% for validation). I pre-trained word vectors on the whole corpus, and I also applied Dropout to the model to avoid over-fitting. After training on 60,000 texts within 12 hrs, the loss had already dropped to a fairly low
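A common stopping rule for this situation is patience-based early stopping on the validation loss. The following framework-agnostic sketch assumes hypothetical train_one_epoch, evaluate, and save_checkpoint callables supplied by the caller:

def train_with_early_stopping(train_one_epoch, evaluate, save_checkpoint,
                              max_epochs=100, patience=5):
    # Stop when validation loss hasn't improved for `patience` epochs.
    best_val, bad_epochs = float('inf'), 0
    for epoch in range(max_epochs):
        train_one_epoch()                 # hypothetical: one pass over training data
        val_loss = evaluate()             # hypothetical: loss on the held-out 20%
        if val_loss < best_val - 1e-4:    # meaningful improvement
            best_val, bad_epochs = val_loss, 0
            save_checkpoint()             # keep the best weights seen so far
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break                     # validation loss has stalled: stop
    return best_val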

Neural Machine Translation model predictions are off-by-one

别说谁变了你拦得住时间么 Submitted on 2019-12-06 07:19:57
Question: Problem Summary In the following example, my NMT model has high loss because it correctly predicts target_input instead of target_output. target_input: 1 3 3 3 3 6 6 6 9 7 7 7 4 4 4 4 4 9 9 10 10 10 3 3 10 10 3 10 3 3 10 10 3 9 9 4 4 4 4 4 3 10 3 3 9 9 3 6 6 6 6 6 6 10 9 9 10 10 4 4 4 4 4 4 4 4 4 4 4 4 9 9 9 9 3 3 3 6 6 6 6 6 9 9 10 3 4 4 4 4 4 4 4 4 4 4 4 4 9 9 10 3 10 9 9 3 4 4 4 4 4 4 4 4 4 10 10 4 4 4 4 4 4 4 4 4 4 9 9 10 3 6 6 6 6 3 3 3 10 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 9 9 3 3 10 6 6 6 6
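A frequent cause of this symptom is misaligned decoder targets. As an illustration (with hypothetical token ids, not the asker's data), the standard teacher-forcing shift looks like this:

# Standard teacher-forcing alignment: the decoder sees <sos> + tokens as
# input and is trained to emit tokens + <eos>, i.e. everything shifted by one.
SOS, EOS = 1, 2                      # hypothetical special token ids
tokens = [3, 3, 6, 9, 7]             # one target sentence
target_input = [SOS] + tokens        # [1, 3, 3, 6, 9, 7]
target_output = tokens + [EOS]       # [3, 3, 6, 9, 7, 2]
# If target_input is accidentally used as the training label, the model
# learns the identity mapping and its predictions look shifted by one.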

How to handle padding when using sequence_length parameter in TensorFlow dynamic_rnn

左心房为你撑大大i Submitted on 2019-12-06 06:38:12
Question: I'm trying to use the dynamic_rnn function in TensorFlow to speed up training. After doing some reading, my understanding is that one way to speed up training is to explicitly pass a value to the sequence_length parameter of this function. After a bit more reading, and finding this SO explanation, it seems like what I need to pass is a vector (maybe defined by a tf.placeholder) that contains the length of each sequence within a batch. Here's where I'm confused: in order to take advantage of
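One common way to supply sequence_length without feeding it by hand is to derive it from the padded batch itself. A minimal TF 1.x sketch, assuming the padding id is 0 and with hypothetical vocabulary/embedding/cell sizes:

import tensorflow as tf

# Zero-padded integer batch of shape (batch, max_time); pad id assumed to be 0.
inputs = tf.placeholder(tf.int32, [None, None], name='inputs')

# Length of each sequence = number of non-pad positions in its row.
seq_len = tf.reduce_sum(tf.cast(tf.not_equal(inputs, 0), tf.int32), axis=1)

embedding = tf.get_variable('embedding', [10000, 128])  # hypothetical sizes
embedded = tf.nn.embedding_lookup(embedding, inputs)

cell = tf.nn.rnn_cell.GRUCell(256)
# Past each sequence's true length, dynamic_rnn copies the state through
# unchanged and emits zeros instead of processing the padding steps.
outputs, state = tf.nn.dynamic_rnn(cell, embedded,
                                   sequence_length=seq_len, dtype=tf.float32)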

Predict exponentially weighted average using a simple RNN

≯℡__Kan透↙ Submitted on 2019-12-06 06:33:57
Question: In an attempt to further explore the keras-tf RNN capabilities and different parameters, I decided to solve a toy problem as described: build a source dataset composed of a sequence of random numbers, and build a "labels" dataset by applying the EWMA formula to the source dataset. The idea behind it is that EWMA has a very clear and simple definition of how it uses the "history" of the sequence: EWMA_t = (1 - alpha) * EWMA_{t-1} + alpha * x_t. My assumption is that when looking at a
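For concreteness, a small NumPy sketch of how such a source/label pair could be generated (alpha and the sequence length are arbitrary choices for the example):

import numpy as np

alpha, n = 0.3, 1000
x = np.random.rand(n)                 # source: a sequence of random numbers

# labels[t] = (1 - alpha) * labels[t-1] + alpha * x[t]
labels = np.empty(n)
labels[0] = x[0]
for t in range(1, n):
    labels[t] = (1 - alpha) * labels[t - 1] + alpha * x[t]

# Shaped for a Keras SimpleRNN: (samples, timesteps, features) = (1, n, 1)
X = x.reshape(1, n, 1)
y = labels.reshape(1, n, 1)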

Tensorflow RNN input size

大兔子大兔子 Submitted on 2019-12-06 06:25:39
I am trying to use TensorFlow to create a recurrent neural network. My code is something like this:
import tensorflow as tf
rnn_cell = tf.nn.rnn_cell.GRUCell(3)
inputs = [tf.constant([[0, 1]], dtype=tf.float32), tf.constant([[2, 3]], dtype=tf.float32)]
outputs, end = tf.nn.rnn(rnn_cell, inputs, dtype=tf.float32)
Now, everything runs just fine. However, I am rather confused by what is actually going on. The output dimensions are always batch size x the size of the RNN cell's hidden state; how can they be completely independent of the input size? If my understanding is correct, the inputs
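The short answer is that the input size is absorbed by the cell's internal weight matrices, so only the parameter shapes depend on it, never the outputs. A sketch that makes this visible (assuming TF 1.x, where tf.nn.rnn was renamed tf.nn.static_rnn):

import tensorflow as tf

rnn_cell = tf.nn.rnn_cell.GRUCell(3)
inputs = [tf.constant([[0, 1]], dtype=tf.float32),
          tf.constant([[2, 3]], dtype=tf.float32)]
outputs, end = tf.nn.static_rnn(rnn_cell, inputs, dtype=tf.float32)

# The input size only appears inside the cell's weights: each gate kernel
# has shape (input_size + num_units, ...), while every output is (batch, num_units).
for v in tf.trainable_variables():
    print(v.name, v.shape)   # gates/kernel: (2 + 3, 2 * 3), candidate/kernel: (2 + 3, 3)
print(outputs[0].shape)      # (1, 3): batch x num_units, independent of input size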

Format time-series data for short term forecasting using Recurrent Neural networks

穿精又带淫゛_ Submitted on 2019-12-06 05:37:14
I want to forecast day-ahead power consumption using recurrent neural networks (RNNs), but I find the required data format (samples, timesteps, features) for RNNs confusing. Let me explain with an example: I have power_dataset.csv on Dropbox, which contains power consumption from 5 June to 18 June at a 10-minute rate (144 observations per day). Now, to check the performance of an RNN using the rnn R package, I am following these steps: train model M for the usage of 17 June by using data from 5-16 June; predict usage of 18 June by using M and updated usage from 6-17 June. My understanding of RNN
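To make the (samples, timesteps, features) shape concrete, here is a NumPy illustration (Python rather than the rnn R package the question uses) of turning the 10-minute series into day-in/day-out windows; the layout of power_dataset.csv is assumed to be one reading per line:

import numpy as np

# 10-minute readings, 144 per day, covering 5-18 June.
power = np.loadtxt('power_dataset.csv')           # assumes one value per line

timesteps = 144                                    # one day of history per sample
X, y = [], []
for i in range(len(power) - 2 * timesteps + 1):
    X.append(power[i:i + timesteps])               # one day of usage ...
    y.append(power[i + timesteps:i + 2 * timesteps])  # ... predicts the next day

# (samples, timesteps, features): each 10-minute reading is one timestep
# carrying a single feature, the consumption value.
X = np.array(X).reshape(-1, timesteps, 1)
y = np.array(y)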