recurrent-neural-network

Stateful LSTM: When to reset states?

岁酱吖の submitted on 2019-11-29 02:19:42
Given X with dimensions (m samples, n sequences, k features) and y labels with dimensions (m samples, 0/1): suppose I want to train a stateful LSTM (going by the Keras definition, where "stateful = True" means that cell states are not reset between sequences per sample -- please correct me if I'm wrong!). Are states supposed to be reset on a per-epoch basis or a per-sample basis? Example:

    for e in epoch:
        for m in X.shape[0]:        # for each sample
            for n in X.shape[1]:    # for each sequence
                # train_on_batch for model...
                # model.reset_states()  (1) I believe this is 'stateful = False'?
            # model.reset_states(
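Below is a minimal sketch (not code from the question) of one common pattern for a stateful LSTM in Keras, assuming the (m samples, n sequences, k features) layout described above: train on one sub-sequence at a time and call reset_states() only once a sample is finished, so state carries across the sub-sequences of a sample but not across samples. The layer sizes and the use of train_on_batch are illustrative assumptions.

    # Hedged sketch: stateful LSTM with a per-sample state reset.
    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense

    m, n, k = 32, 10, 8                     # samples, sequences per sample, features (assumed)
    X = np.random.rand(m, n, k).astype("float32")
    y = np.random.randint(0, 2, size=(m, 1)).astype("float32")

    model = Sequential([
        LSTM(16, batch_input_shape=(1, 1, k), stateful=True),
        Dense(1, activation="sigmoid"),
    ])
    model.compile(loss="binary_crossentropy", optimizer="adam")

    for epoch in range(5):
        for i in range(m):                                  # for each sample
            for j in range(n):                              # for each sub-sequence
                model.train_on_batch(X[i, j].reshape(1, 1, k), y[i].reshape(1, 1))
            model.reset_states()                            # reset once the sample ends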

Time Series Prediction via Neural Networks

谁说胖子不能爱 submitted on 2019-11-28 17:05:52
I have been working on neural networks for various purposes lately. I have had great success with digit recognition, XOR, and various other easy, hello-world-ish applications. I would like to tackle the domain of time series estimation. I do not have a university account at the moment to read all the IEEE/ACM papers on the topic (for free), nor can I find many resources detailing the use of ANNs for time series forecasting. I would like to know if anyone has any suggestions or can recommend any resources concerning using ANNs for forecasting via time series data? I would assume that to train the NN, you
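As a starting point, the usual way to make a time series trainable by a network is to recast it as a supervised problem with a sliding window: the previous few observations are the input vector and the next observation is the target. A hedged sketch, with a made-up sine series and an assumed window size:

    import numpy as np

    def make_windows(series, window=4):
        # Each row of X is `window` consecutive observations; y is the next one.
        X, y = [], []
        for i in range(len(series) - window):
            X.append(series[i:i + window])
            y.append(series[i + window])
        return np.array(X), np.array(y)

    series = np.sin(np.linspace(0, 10, 200))   # toy series (assumption)
    X, y = make_windows(series, window=4)
    print(X.shape, y.shape)                    # (196, 4) (196,)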

How to construct input data to LSTM for time series multi-step horizon with external features?

試著忘記壹切 submitted on 2019-11-28 16:53:19
Question: I'm trying to use an LSTM to do store sales forecasting. Here is what my raw data look like:

| Date       | StoreID | Sales | Temperature | Open | StoreType |
|------------|---------|-------|-------------|------|-----------|
| 01/01/2016 | 1       | 0     | 36          | 0    | 1         |
| 01/02/2016 | 1       | 10100 | 42          | 1    | 1         |
| ...        |         |       |             |      |           |
| 12/31/2016 | 1       | 14300 | 39          | 1    | 1         |
| 01/01/2016 | 2       | 25000 | 46          | 1    | 3         |
| 01/02/2016 | 2       | 23700 | 43          | 1    | 3         |
| ...        |         |       |             |      |           |
| 12/31/2016 | 2       | 20600 | 37          | 1    | 3         |
| ...        |         |       |             |      |           |
| 12/31/2016 | 10      | 19800 |
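One common way to build the (samples, timesteps, features) tensor an LSTM expects from a table like this is to group by store and slide a lookback window over each store's rows, keeping the exogenous columns (Temperature, Open, StoreType) as extra features and the next `horizon` days of Sales as the multi-step target. A hedged sketch, assuming the column names shown above and pandas/NumPy; the lookback and horizon values are arbitrary:

    import numpy as np
    import pandas as pd

    def store_windows(df, features, target="Sales", lookback=28, horizon=7):
        X, y = [], []
        values = df[features].to_numpy(dtype="float32")
        targets = df[target].to_numpy(dtype="float32")
        for i in range(len(df) - lookback - horizon + 1):
            X.append(values[i:i + lookback])                         # past features
            y.append(targets[i + lookback:i + lookback + horizon])   # future sales
        return np.array(X), np.array(y)

    # Usage (df is the raw table above loaded into a DataFrame, sorted by Date):
    # per_store = [store_windows(g, ["Sales", "Temperature", "Open", "StoreType"])
    #              for _, g in df.groupby("StoreID")]
    # X = np.concatenate([p[0] for p in per_store])   # (samples, 28, 4)
    # y = np.concatenate([p[1] for p in per_store])   # (samples, 7)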

Why do we "pack" the sequences in PyTorch?

强颜欢笑 submitted on 2019-11-28 16:39:15
I was trying to replicate "How to use packing for variable-length sequence inputs for rnn", but I guess I first need to understand why we need to "pack" the sequence. I understand why we need to "pad" them, but why is "packing" (through pack_padded_sequence) necessary? Any high-level explanation would be appreciated! Umang Gupta: I have stumbled upon this problem too, and below is what I figured out. When training an RNN (LSTM, GRU, or vanilla RNN), it is difficult to batch variable-length sequences. For example: if the lengths of the sequences in a batch of size 8 are [4,6,8,5,4,3,7,8], you will pad all the
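For reference, a hedged sketch of the PyTorch calls the question is about: pad a batch of variable-length sequences, then pack it so the LSTM skips the padded positions. The sizes are made up.

    import torch
    from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

    seqs = [torch.randn(l, 5) for l in (4, 6, 8, 3)]    # 4 sequences, 5 features each
    lengths = torch.tensor([len(s) for s in seqs])

    padded = pad_sequence(seqs, batch_first=True)       # (4, 8, 5), zero-padded
    packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)

    rnn = torch.nn.LSTM(input_size=5, hidden_size=7, batch_first=True)
    packed_out, (h_n, c_n) = rnn(packed)                # padding never reaches the LSTM
    out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
    print(out.shape)                                    # torch.Size([4, 8, 7])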

How to use return_sequences option and TimeDistributed layer in Keras?

梦想的初衷 submitted on 2019-11-28 14:53:56
Question: I have a dialog corpus like the one below, and I want to implement an LSTM model which predicts a system action. The system action is described as a bit vector, and a user input is calculated as a word embedding, which is also a bit vector. t1: user: "Do you know an apple?", system: "no" (action=2) t2: user: "xxxxxx", system: "yyyy" (action=0) t3: user: "aaaaaa", system: "bbbb" (action=5) So what I want to realize is a "many to many (2)" model. When my model receives a user input, it must output a system
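For the "many to many" shape (one predicted action per turn), the two Keras pieces named in the title are usually combined like this: return_sequences=True makes the LSTM emit an output at every timestep, and TimeDistributed applies the same Dense classifier to each of those outputs. A hedged sketch with assumed sizes:

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, TimeDistributed, Dense

    timesteps, embed_dim, n_actions = 3, 64, 6   # assumed sizes
    model = Sequential([
        LSTM(32, return_sequences=True, input_shape=(timesteps, embed_dim)),
        TimeDistributed(Dense(n_actions, activation="softmax")),
    ])
    model.compile(loss="categorical_crossentropy", optimizer="adam")
    model.summary()   # output shape: (None, 3, 6) -- one action distribution per turn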

What is the intuition behind using tanh in LSTM?

老子叫甜甜 submitted on 2019-11-28 13:45:29
Question: In an LSTM network (Understanding LSTMs), why do the input gate and output gate use tanh? What is the intuition behind this? Is it just a nonlinear transformation? If so, can I change both to another activation function (e.g. ReLU)? Answer 1: Sigmoid, specifically, is used as the gating function for the three gates (input, output, forget) in an LSTM; since it outputs a value between 0 and 1, it can let either no flow or complete flow of information through the gates. On the other hand, to overcome the vanishing
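A quick numerical illustration of the point in the answer: sigmoid squashes to (0, 1), which suits a gate that scales how much information passes, while tanh squashes to (-1, 1) and is zero-centered, which suits the candidate values that are added to (or read out of) the cell state. A small hedged sketch:

    import numpy as np

    x = np.linspace(-5, 5, 5)                      # [-5, -2.5, 0, 2.5, 5]
    sigmoid = 1 / (1 + np.exp(-x))
    print(np.round(sigmoid, 3))                    # approx [0.007 0.076 0.5 0.924 0.993]
    print(np.round(np.tanh(x), 3))                 # approx [-1. -0.987 0. 0.987 1.]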

Tensorflow TypeError: Fetch argument None has invalid type <type 'NoneType'>?

岁酱吖の submitted on 2019-11-28 10:46:57
I'm building an RNN loosely based on the TensorFlow tutorial. The relevant parts of my model are as follows:

    input_sequence = tf.placeholder(tf.float32, [BATCH_SIZE, TIME_STEPS, PIXEL_COUNT + AUX_INPUTS])
    output_actual = tf.placeholder(tf.float32, [BATCH_SIZE, OUTPUT_SIZE])

    lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(CELL_SIZE, state_is_tuple=False)
    stacked_lstm = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * CELL_LAYERS, state_is_tuple=False)

    initial_state = state = stacked_lstm.zero_state(BATCH_SIZE, tf.float32)
    outputs = []

    with tf.variable_scope("LSTM"):
        for step in xrange(TIME_STEPS):
            if step >
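The error message in the title generally means that one of the fetches passed to session.run() is Python None rather than a tensor or op. A hedged, self-contained reproduction (TF 1.x style via the compat layer, since the question uses the 1.x API); the variable name is made up:

    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

    a = tf.constant(1.0)
    train_op = None                 # e.g. an op that was never actually built/assigned
    with tf.Session() as sess:
        sess.run([a, train_op])     # TypeError: Fetch argument None has invalid type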

How to deal with batches with variable-length sequences in TensorFlow?

好久不见. submitted on 2019-11-28 03:24:20
I was trying to use an RNN (specifically, an LSTM) for sequence prediction. However, I ran into an issue with variable sequence lengths. For example:

    sent_1 = "I am flying to Dubai"
    sent_2 = "I was traveling from US to Dubai"

I am trying to predict the next word after the current one with a simple RNN, based on this Benchmark for building a PTB LSTM model. However, the num_steps parameter (used for unrolling to the previous hidden states) should remain the same in each TensorFlow epoch. Basically, batching sentences is not possible as the sentences vary in length.

    # inputs = [tf.squeeze
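For context, a hedged sketch of the usual remedy in this API (TF 1.x via the compat layer): pad every sentence in the batch to the longest one and pass the true lengths through sequence_length, so tf.nn.dynamic_rnn stops updating the state past each sentence's end. All sizes are assumptions:

    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

    batch_size, max_steps, embed_dim = 2, 8, 16
    inputs = tf.placeholder(tf.float32, [batch_size, max_steps, embed_dim])
    lengths = tf.placeholder(tf.int32, [batch_size])        # e.g. [5, 7]

    cell = tf.nn.rnn_cell.BasicLSTMCell(32)
    outputs, final_state = tf.nn.dynamic_rnn(
        cell, inputs, sequence_length=lengths, dtype=tf.float32)
    # outputs: (2, 8, 32); positions past each length are zeros, and
    # final_state reflects the last valid step of every sentence.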

How can I feed the last output y(t-1) as input for generating y(t) in a TensorFlow RNN?

六眼飞鱼酱① submitted on 2019-11-28 03:24:05
Question: I want to design a single-layer RNN in TensorFlow such that the last output (y(t-1)) participates in updating the hidden state:

    h(t) = tanh(W_{ih} * x(t) + W_{hh} * h(t-1) + W_{oh} * y(t-1))
    y(t) = W_{ho} * h(t)

How can I feed the last input y(t-1) as input for updating the hidden state? Answer 1: Is y(t-1) the last input or output? In both cases it is not a straight fit with the TensorFlow RNN cell abstraction. If your RNN is simple, you can just write the loop on your own; then you have full control.
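Taking the answer's "write the loop on your own" route, here is a hedged sketch that carries both the hidden state and the previous output through a manually unrolled loop, matching the two equations above (TF 1.x via the compat layer; all sizes and variable names are assumptions):

    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

    batch, steps, in_dim, hid, out_dim = 4, 10, 8, 16, 3
    x = tf.placeholder(tf.float32, [batch, steps, in_dim])

    W_ih = tf.get_variable("W_ih", [in_dim, hid])
    W_hh = tf.get_variable("W_hh", [hid, hid])
    W_oh = tf.get_variable("W_oh", [out_dim, hid])
    W_ho = tf.get_variable("W_ho", [hid, out_dim])

    h = tf.zeros([batch, hid])
    y_prev = tf.zeros([batch, out_dim])
    outputs = []
    for t in range(steps):
        h = tf.tanh(tf.matmul(x[:, t, :], W_ih) + tf.matmul(h, W_hh)
                    + tf.matmul(y_prev, W_oh))
        y_prev = tf.matmul(h, W_ho)        # y(t) is fed back as y(t-1) next step
        outputs.append(y_prev)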

How do I create a variable-length input LSTM in Keras?

一世执手 submitted on 2019-11-28 03:20:21
I am trying to do some vanilla pattern recognition with an LSTM, using Keras to predict the next element in a sequence. My data look like this, where the label of the training sequence is the last element in the list: X_train['Sequence'][n][-1]. Because my Sequence column can have a variable number of elements, I believe an RNN to be the best model to use. Below is my attempt to build an LSTM in Keras:

    # Build the model
    # A few arbitrary constants...
    max_features = 20000
    out_size = 128
    # The max length should be the length of the longest sequence (minus one to account for the
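One standard answer to the variable-length question, sketched below with made-up toy data: pad the sequences to a common length with pad_sequences and let an Embedding layer with mask_zero=True (or a Masking layer) tell the LSTM to ignore the padded positions. The constants mirror the ones in the excerpt; everything else is an assumption:

    import numpy as np
    from tensorflow.keras.preprocessing.sequence import pad_sequences
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Embedding, LSTM, Dense

    max_features, out_size = 20000, 128                      # constants from the question
    raw = [[3, 17, 9, 7], [5, 2, 11], [8, 1, 4, 12, 6]]      # toy variable-length sequences
    X = pad_sequences([s[:-1] for s in raw], padding="pre")  # inputs: all but the last element
    y = np.array([s[-1] for s in raw])                       # label: the last element

    model = Sequential([
        Embedding(max_features, out_size, mask_zero=True),   # padded zeros are masked out
        LSTM(64),
        Dense(max_features, activation="softmax"),
    ])
    model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
    model.fit(X, y, epochs=1, verbose=0)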