recurrent-neural-network

Stateful LSTM: When to reset states?

岁酱吖の submitted on 2019-11-29 02:19:42
Given X with dimensions (m samples, n sequences, k features) and y labels with dimensions (m samples, 0/1): suppose I want to train a stateful LSTM (going by the Keras definition, where "stateful = True" means that cell states are not reset between sequences per sample -- please correct me if I'm wrong!). Are states supposed to be reset on a per-epoch basis or a per-sample basis? Example:

    for e in epoch:
        for m in X.shape[0]:        # for each sample
            for n in X.shape[1]:    # for each sequence
                # train_on_batch for model...
                # model.reset_states()  (1) I believe this is 'stateful = False'?
            # model.reset_states(
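Below is a minimal sketch (not code from the question) of one common pattern for a stateful LSTM in Keras, assuming the (m samples, n sequences, k features) layout described above: train on one sub-sequence at a time and call reset_states() only once a sample is finished, so state carries across the sub-sequences of a sample but not across samples. The layer sizes and the use of train_on_batch are illustrative assumptions.

    # Hedged sketch: stateful LSTM with a per-sample state reset.
    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense

    m, n, k = 32, 10, 8                     # samples, sequences per sample, features (assumed)
    X = np.random.rand(m, n, k).astype("float32")
    y = np.random.randint(0, 2, size=(m, 1)).astype("float32")

    model = Sequential([
        LSTM(16, batch_input_shape=(1, 1, k), stateful=True),
        Dense(1, activation="sigmoid"),
    ])
    model.compile(loss="binary_crossentropy", optimizer="adam")

    for epoch in range(5):
        for i in range(m):                                  # for each sample
            for j in range(n):                              # for each sub-sequence
                model.train_on_batch(X[i, j].reshape(1, 1, k), y[i].reshape(1, 1))
            model.reset_states()                            # reset once the sample ends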

Time Series Prediction via Neural Networks

谁说胖子不能爱 submitted on 2019-11-28 17:05:52
I have been working on neural networks for various purposes lately. I have had great success with digit recognition, XOR, and various other easy, hello-world-ish applications. I would like to tackle the domain of time series estimation. I do not have a university account at the moment to read all the IEEE/ACM papers on the topic (for free), nor can I find many resources detailing the use of ANNs for time series forecasting. I would like to know if anyone has any suggestions or can recommend any resources concerning using ANNs for forecasting via time series data? I would assume that to train the NN, you
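As a starting point, the usual way to make a time series trainable by a network is to recast it as a supervised problem with a sliding window: the previous few observations are the input vector and the next observation is the target. A hedged sketch, with a made-up sine series and an assumed window size:

    import numpy as np

    def make_windows(series, window=4):
        # Each row of X is `window` consecutive observations; y is the next one.
        X, y = [], []
        for i in range(len(series) - window):
            X.append(series[i:i + window])
            y.append(series[i + window])
        return np.array(X), np.array(y)

    series = np.sin(np.linspace(0, 10, 200))   # toy series (assumption)
    X, y = make_windows(series, window=4)
    print(X.shape, y.shape)                    # (196, 4) (196,)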

How to construct input data to LSTM for time series multi-step horizon with external features?

試著忘記壹切 submitted on 2019-11-28 16:53:19
Question: I'm trying to use an LSTM to do store sales forecasting. Here is what my raw data look like:

| Date       | StoreID | Sales | Temperature | Open | StoreType |
|------------|---------|-------|-------------|------|-----------|
| 01/01/2016 | 1       | 0     | 36          | 0    | 1         |
| 01/02/2016 | 1       | 10100 | 42          | 1    | 1         |
| ...        |         |       |             |      |           |
| 12/31/2016 | 1       | 14300 | 39          | 1    | 1         |
| 01/01/2016 | 2       | 25000 | 46          | 1    | 3         |
| 01/02/2016 | 2       | 23700 | 43          | 1    | 3         |
| ...        |         |       |             |      |           |
| 12/31/2016 | 2       | 20600 | 37          | 1    | 3         |
| ...        |         |       |             |      |           |
| 12/31/2016 | 10      | 19800 |
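One common way to build the (samples, timesteps, features) tensor an LSTM expects from a table like this is to group by store and slide a lookback window over each store's rows, keeping the exogenous columns (Temperature, Open, StoreType) as extra features and the next `horizon` days of Sales as the multi-step target. A hedged sketch, assuming the column names shown above and pandas/NumPy; the lookback and horizon values are arbitrary:

    import numpy as np
    import pandas as pd

    def store_windows(df, features, target="Sales", lookback=28, horizon=7):
        X, y = [], []
        values = df[features].to_numpy(dtype="float32")
        targets = df[target].to_numpy(dtype="float32")
        for i in range(len(df) - lookback - horizon + 1):
            X.append(values[i:i + lookback])                         # past features
            y.append(targets[i + lookback:i + lookback + horizon])   # future sales
        return np.array(X), np.array(y)

    # Usage (df is the raw table above loaded into a DataFrame, sorted by Date):
    # per_store = [store_windows(g, ["Sales", "Temperature", "Open", "StoreType"])
    #              for _, g in df.groupby("StoreID")]
    # X = np.concatenate([p[0] for p in per_store])   # (samples, 28, 4)
    # y = np.concatenate([p[1] for p in per_store])   # (samples, 7)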

Why do we "pack" the sequences in PyTorch?

强颜欢笑 submitted on 2019-11-28 16:39:15
I was trying to replicate "How to use packing for variable-length sequence inputs for rnn", but I guess I first need to understand why we need to "pack" the sequence. I understand why we need to "pad" them, but why is "packing" (through pack_padded_sequence) necessary? Any high-level explanation would be appreciated! Umang Gupta: I have stumbled upon this problem too, and below is what I figured out. When training an RNN (LSTM, GRU, or vanilla RNN), it is difficult to batch variable-length sequences. For example: if the lengths of the sequences in a batch of size 8 are [4,6,8,5,4,3,7,8], you will pad all the
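For reference, a hedged sketch of the PyTorch calls the question is about: pad a batch of variable-length sequences, then pack it so the LSTM skips the padded positions. The sizes are made up.

    import torch
    from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

    seqs = [torch.randn(l, 5) for l in (4, 6, 8, 3)]    # 4 sequences, 5 features each
    lengths = torch.tensor([len(s) for s in seqs])

    padded = pad_sequence(seqs, batch_first=True)       # (4, 8, 5), zero-padded
    packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)

    rnn = torch.nn.LSTM(input_size=5, hidden_size=7, batch_first=True)
    packed_out, (h_n, c_n) = rnn(packed)                # padding never reaches the LSTM
    out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
    print(out.shape)                                    # torch.Size([4, 8, 7])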

How to use return_sequences option and TimeDistributed layer in Keras?

梦想的初衷 submitted on 2019-11-28 14:53:56
Question: I have a dialog corpus like the one below, and I want to implement an LSTM model which predicts a system action. The system action is described as a bit vector, and a user input is calculated as a word embedding, which is also a bit vector. t1: user: "Do you know an apple?", system: "no" (action=2) t2: user: "xxxxxx", system: "yyyy" (action=0) t3: user: "aaaaaa", system: "bbbb" (action=5) So what I want to realize is a "many to many (2)" model. When my model receives a user input, it must output a system
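For the "many to many" shape (one predicted action per turn), the two Keras pieces named in the title are usually combined like this: return_sequences=True makes the LSTM emit an output at every timestep, and TimeDistributed applies the same Dense classifier to each of those outputs. A hedged sketch with assumed sizes:

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, TimeDistributed, Dense

    timesteps, embed_dim, n_actions = 3, 64, 6   # assumed sizes
    model = Sequential([
        LSTM(32, return_sequences=True, input_shape=(timesteps, embed_dim)),
        TimeDistributed(Dense(n_actions, activation="softmax")),
    ])
    model.compile(loss="categorical_crossentropy", optimizer="adam")
    model.summary()   # output shape: (None, 3, 6) -- one action distribution per turn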

What is the intuition behind using tanh in LSTM?

老子叫甜甜 submitted on 2019-11-28 13:45:29
Question: In an LSTM network (Understanding LSTMs), why do the input gate and output gate use tanh? What is the intuition behind this? Is it just a nonlinear transformation? If so, can I change both to another activation function (e.g. ReLU)? Answer 1: Sigmoid, specifically, is used as the gating function for the three gates (input, output, forget) in an LSTM; since it outputs a value between 0 and 1, it can let either no flow or complete flow of information through the gates. On the other hand, to overcome the vanishing
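A quick numerical illustration of the point in the answer: sigmoid squashes to (0, 1), which suits a gate that scales how much information passes, while tanh squashes to (-1, 1) and is zero-centered, which suits the candidate values that are added to (or read out of) the cell state. A small hedged sketch:

    import numpy as np

    x = np.linspace(-5, 5, 5)                      # [-5, -2.5, 0, 2.5, 5]
    sigmoid = 1 / (1 + np.exp(-x))
    print(np.round(sigmoid, 3))                    # approx [0.007 0.076 0.5 0.924 0.993]
    print(np.round(np.tanh(x), 3))                 # approx [-1. -0.987 0. 0.987 1.]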

Tensorflow TypeError: Fetch argument None has invalid type <type 'NoneType'>?

岁酱吖の submitted on 2019-11-28 10:46:57
I'm building an RNN loosely based on the TensorFlow tutorial. The relevant parts of my model are as follows:

    input_sequence = tf.placeholder(tf.float32, [BATCH_SIZE, TIME_STEPS, PIXEL_COUNT + AUX_INPUTS])
    output_actual = tf.placeholder(tf.float32, [BATCH_SIZE, OUTPUT_SIZE])

    lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(CELL_SIZE, state_is_tuple=False)
    stacked_lstm = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * CELL_LAYERS, state_is_tuple=False)

    initial_state = state = stacked_lstm.zero_state(BATCH_SIZE, tf.float32)
    outputs = []

    with tf.variable_scope("LSTM"):
        for step in xrange(TIME_STEPS):
            if step >
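The error message in the title generally means that one of the fetches passed to session.run() is Python None rather than a tensor or op. A hedged, self-contained reproduction (TF 1.x style via the compat layer, since the question uses the 1.x API); the variable name is made up:

    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

    a = tf.constant(1.0)
    train_op = None                 # e.g. an op that was never actually built/assigned
    with tf.Session() as sess:
        sess.run([a, train_op])     # TypeError: Fetch argument None has invalid type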

How to deal with batches with variable-length sequences in TensorFlow?

好久不见. submitted on 2019-11-28 03:24:20
I was trying to use an RNN (specifically, an LSTM) for sequence prediction. However, I ran into an issue with variable sequence lengths. For example:

    sent_1 = "I am flying to Dubai"
    sent_2 = "I was traveling from US to Dubai"

I am trying to predict the next word after the current one with a simple RNN, based on this Benchmark for building a PTB LSTM model. However, the num_steps parameter (used for unrolling to the previous hidden states) should remain the same in each TensorFlow epoch. Basically, batching sentences is not possible as the sentences vary in length.

    # inputs = [tf.squeeze
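For context, a hedged sketch of the usual remedy in this API (TF 1.x via the compat layer): pad every sentence in the batch to the longest one and pass the true lengths through sequence_length, so tf.nn.dynamic_rnn stops updating the state past each sentence's end. All sizes are assumptions:

    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

    batch_size, max_steps, embed_dim = 2, 8, 16
    inputs = tf.placeholder(tf.float32, [batch_size, max_steps, embed_dim])
    lengths = tf.placeholder(tf.int32, [batch_size])        # e.g. [5, 7]

    cell = tf.nn.rnn_cell.BasicLSTMCell(32)
    outputs, final_state = tf.nn.dynamic_rnn(
        cell, inputs, sequence_length=lengths, dtype=tf.float32)
    # outputs: (2, 8, 32); positions past each length are zeros, and
    # final_state reflects the last valid step of every sentence.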

How can I feed the last output y(t-1) as input for generating y(t) in a TensorFlow RNN?

六眼飞鱼酱① submitted on 2019-11-28 03:24:05
Question: I want to design a single-layer RNN in TensorFlow such that the last output (y(t-1)) participates in updating the hidden state:

    h(t) = tanh(W_{ih} * x(t) + W_{hh} * h(t-1) + W_{oh} * y(t-1))
    y(t) = W_{ho} * h(t)

How can I feed the last input y(t-1) as input for updating the hidden state? Answer 1: Is y(t-1) the last input or output? In both cases it is not a straight fit with the TensorFlow RNN cell abstraction. If your RNN is simple, you can just write the loop on your own; then you have full control.
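Taking the answer's "write the loop on your own" route, here is a hedged sketch that carries both the hidden state and the previous output through a manually unrolled loop, matching the two equations above (TF 1.x via the compat layer; all sizes and variable names are assumptions):

    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

    batch, steps, in_dim, hid, out_dim = 4, 10, 8, 16, 3
    x = tf.placeholder(tf.float32, [batch, steps, in_dim])

    W_ih = tf.get_variable("W_ih", [in_dim, hid])
    W_hh = tf.get_variable("W_hh", [hid, hid])
    W_oh = tf.get_variable("W_oh", [out_dim, hid])
    W_ho = tf.get_variable("W_ho", [hid, out_dim])

    h = tf.zeros([batch, hid])
    y_prev = tf.zeros([batch, out_dim])
    outputs = []
    for t in range(steps):
        h = tf.tanh(tf.matmul(x[:, t, :], W_ih) + tf.matmul(h, W_hh)
                    + tf.matmul(y_prev, W_oh))
        y_prev = tf.matmul(h, W_ho)        # y(t) is fed back as y(t-1) next step
        outputs.append(y_prev)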

How do I create a variable-length input LSTM in Keras?

一世执手 submitted on 2019-11-28 03:20:21
I am trying to do some vanilla pattern recognition with an LSTM, using Keras to predict the next element in a sequence. My data look like this, where the label of the training sequence is the last element in the list: X_train['Sequence'][n][-1]. Because my Sequence column can have a variable number of elements, I believe an RNN to be the best model to use. Below is my attempt to build an LSTM in Keras:

    # Build the model
    # A few arbitrary constants...
    max_features = 20000
    out_size = 128
    # The max length should be the length of the longest sequence (minus one to account for the
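One standard answer to the variable-length question, sketched below with made-up toy data: pad the sequences to a common length with pad_sequences and let an Embedding layer with mask_zero=True (or a Masking layer) tell the LSTM to ignore the padded positions. The constants mirror the ones in the excerpt; everything else is an assumption:

    import numpy as np
    from tensorflow.keras.preprocessing.sequence import pad_sequences
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Embedding, LSTM, Dense

    max_features, out_size = 20000, 128                      # constants from the question
    raw = [[3, 17, 9, 7], [5, 2, 11], [8, 1, 4, 12, 6]]      # toy variable-length sequences
    X = pad_sequences([s[:-1] for s in raw], padding="pre")  # inputs: all but the last element
    y = np.array([s[-1] for s in raw])                       # label: the last element

    model = Sequential([
        Embedding(max_features, out_size, mask_zero=True),   # padded zeros are masked out
        LSTM(64),
        Dense(max_features, activation="softmax"),
    ])
    model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
    model.fit(X, y, epochs=1, verbose=0)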