LSTM

What is the difference between CuDNNLSTM and LSTM in Keras?

不想你离开。 submitted on 2019-12-02 18:02:31
In Keras, the high-level deep learning library, there are multiple types of recurrent layers, including LSTM (Long Short-Term Memory) and CuDNNLSTM. According to the Keras documentation, a CuDNNLSTM is a "Fast LSTM implementation backed by CuDNN. Can only be run on GPU, with the TensorFlow backend." It is my belief that Keras automatically uses the GPU wherever possible. According to the TensorFlow build instructions, to have a working TensorFlow GPU backend, you will need CuDNN: "The following NVIDIA software must be installed on your system: NVIDIA's Cuda Toolkit (>= 7.0). We recommend
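A minimal sketch, assuming multi-backend Keras (2.x, before 2.4) with a TensorFlow GPU backend, of how the two layers are swapped in and out; the shapes are hypothetical:

    # Minimal sketch, assuming multi-backend Keras where
    # keras.layers.CuDNNLSTM still exists; shapes are hypothetical.
    from keras.models import Sequential
    from keras.layers import LSTM, CuDNNLSTM, Dense

    def build_model(use_cudnn):
        model = Sequential()
        # Both layers produce the same output shape; CuDNNLSTM trades
        # configurability (no custom activations, no recurrent dropout)
        # for a much faster fused CuDNN kernel, and requires a GPU.
        rnn = CuDNNLSTM if use_cudnn else LSTM
        model.add(rnn(64, input_shape=(100, 8)))  # (timesteps, features)
        model.add(Dense(1))
        model.compile(optimizer='adam', loss='mse')
        return model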

Understanding Keras LSTMs: Role of Batch-size and Statefulness

陌路散爱 submitted on 2019-12-02 17:46:54
Sources
There are several sources out there explaining stateful/stateless LSTMs and the role of batch_size, which I've read already. I'll refer to them later in my post:
[1] https://machinelearningmastery.com/understanding-stateful-lstm-recurrent-neural-networks-python-keras/
[2] https://machinelearningmastery.com/stateful-stateless-lstm-time-series-forecasting-python/
[3] http://philipperemy.github.io/keras-stateful-lstm/
[4] https://machinelearningmastery.com/use-different-batch-sizes-training-predicting-python-keras/
And also other SO threads like Understanding Keras LSTMs and
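A minimal sketch (Keras, hypothetical shapes) of the constraint batch_size imposes on a stateful LSTM: with stateful=True, the state of sample i in one batch is carried into sample i of the next batch, so the batch size must be fixed when the model is defined:

    # Minimal sketch, assuming Keras; shapes are hypothetical.
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    batch_size = 32  # must be fixed for a stateful LSTM
    model = Sequential()
    # batch_input_shape = (batch_size, timesteps, features)
    model.add(LSTM(50, stateful=True, batch_input_shape=(batch_size, 10, 1)))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mse')

    # With stateful=True, state persists across batches until explicitly
    # reset, so shuffling is disabled and states are cleared per epoch:
    # for epoch in range(n_epochs):
    #     model.fit(X, y, batch_size=batch_size, shuffle=False, epochs=1)
    #     model.reset_states()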

Use LSTM tutorial code to predict next word in a sentence?

三世轮回 submitted on 2019-12-02 17:35:53
I've been trying to understand the sample code at https://www.tensorflow.org/tutorials/recurrent , which you can find at https://github.com/tensorflow/models/blob/master/tutorials/rnn/ptb/ptb_word_lm.py (using tensorflow 1.3.0). I've summarized (what I think are) the key parts for my question below:

    size = 200
    vocab_size = 10000
    layers = 2
    # input_.input_data is a 2D tensor [batch_size, num_steps] of
    # word ids, from 1 to 10000
    cell = tf.contrib.rnn.MultiRNNCell(
        [tf.contrib.rnn.BasicLSTMCell(size) for _ in range(2)]
    )
    embedding = tf.get_variable(
        "embedding", [vocab_size, size], dtype=tf
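A minimal sketch (TF 1.x; logits, session, feed, and id_to_word are hypothetical stand-ins for the tutorial's objects) of how a next-word prediction would be read off the model's output:

    # Minimal sketch, assuming TF 1.x; all names below are hypothetical
    # stand-ins for the tutorial's tensors and lookup tables.
    import numpy as np

    # logits has shape [batch_size, num_steps, vocab_size]; the prediction
    # for the next word is the distribution at the last time step.
    last_step_logits = logits[:, -1, :]  # [batch_size, vocab_size]
    probs = session.run(last_step_logits, feed_dict=feed)
    next_word_ids = np.argmax(probs, axis=-1)
    next_words = [id_to_word[i] for i in next_word_ids]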

Shuffling training data with LSTM RNN

三世轮回 submitted on 2019-12-02 17:34:06
Since an LSTM RNN uses previous events to predict current sequences, why do we shuffle the training data? Don't we lose the temporal ordering of the training data? How is it still effective at making predictions after being trained on shuffled training data? In general, when you shuffle the training data (a set of sequences), you shuffle the order in which sequences are fed to the RNN; you don't shuffle the ordering within individual sequences. This is fine to do when your network is stateless. Stateless case: the network's memory only persists for the duration of a sequence. Training on
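A minimal sketch (hypothetical array shapes) of the distinction: shuffle along the sequence axis, never along the time axis:

    # Minimal sketch; shapes are hypothetical.
    import numpy as np

    # X holds 1000 independent sequences, each 50 timesteps of 8 features.
    X = np.random.randn(1000, 50, 8)
    y = np.random.randn(1000, 1)

    # Shuffle the order of whole sequences (axis 0); the timestep
    # ordering inside each sequence (axis 1) is left untouched.
    perm = np.random.permutation(len(X))
    X_shuffled, y_shuffled = X[perm], y[perm]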

Tensorflow dynamic RNN (LSTM): how to format input?

僤鯓⒐⒋嵵緔 submitted on 2019-12-02 17:21:14
I have been given some data of this format and the following details:

    person1, day1, feature1, feature2, ..., featureN, label
    person1, day2, feature1, feature2, ..., featureN, label
    ...
    person1, dayN, feature1, feature2, ..., featureN, label
    person2, day1, feature1, feature2, ..., featureN, label
    person2, day2, feature1, feature2, ..., featureN, label
    ...
    person2, dayN, feature1, feature2, ..., featureN, label
    ...

- there is always the same number of features, but each feature might be a 0, representing nothing
- there is a varying number of days available for each person, e.g. person1 has 20 days
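A minimal sketch (TF 1.x; shapes and names are hypothetical) of the usual approach for this setup: pad every person's sequence of days to a common length and pass the true lengths to tf.nn.dynamic_rnn via sequence_length:

    # Minimal sketch, assuming TF 1.x; shapes and names are hypothetical.
    import tensorflow as tf

    n_features = 10
    max_days = 30  # pad every person's sequence up to this length

    # inputs: [batch_size, max_days, n_features], zero-padded past each
    # person's real number of days; seq_len holds the real lengths.
    inputs = tf.placeholder(tf.float32, [None, max_days, n_features])
    seq_len = tf.placeholder(tf.int32, [None])

    cell = tf.contrib.rnn.LSTMCell(64)
    # dynamic_rnn stops stepping each example at its true length, so the
    # zero padding never contaminates the final state.
    outputs, state = tf.nn.dynamic_rnn(cell, inputs,
                                       sequence_length=seq_len,
                                       dtype=tf.float32)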

Tensorflow LSTM Regularization

故事扮演 submitted on 2019-12-02 17:19:56
Question: I was wondering how one can implement L1 or L2 regularization within an LSTM in TensorFlow? TF doesn't give you access to the internal weights of the LSTM, so I'm not certain how one can calculate the norms and add them to the loss. My loss function is just RMS for now. The answers here don't seem to suffice.
Answer 1: The answers in the link you mentioned are the correct way to do it. Iterate through tf.trainable_variables and find the variables associated with your LSTM. An alternative, more
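A minimal sketch (TF 1.x) of that approach, filtering tf.trainable_variables() by name and adding an L2 penalty to the loss; the name filter and the weight 1e-4 are assumptions to adapt to your own variable scopes:

    # Minimal sketch, assuming TF 1.x; the name filter and the
    # regularization weight are assumptions.
    import tensorflow as tf

    lstm_vars = [v for v in tf.trainable_variables()
                 if 'lstm' in v.name.lower() and 'bias' not in v.name.lower()]
    l2_penalty = tf.add_n([tf.nn.l2_loss(v) for v in lstm_vars])
    total_loss = base_loss + 1e-4 * l2_penalty  # base_loss: your RMS loss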

What's the difference between a bidirectional LSTM and an LSTM?

社会主义新天地 submitted on 2019-12-02 17:17:11
Can someone please explain this? I know bidirectional LSTMs have a forward and a backward pass, but what is the advantage of this over a unidirectional LSTM? What is each of them better suited for? An LSTM, at its core, preserves information from inputs that have already passed through it using the hidden state. A unidirectional LSTM only preserves information of the past, because the only inputs it has seen are from the past. Using a bidirectional LSTM will run your inputs in two ways, one from past to future and one from future to past, and what differs in this approach from unidirectional is that in the LSTM
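A minimal sketch (Keras; shapes are hypothetical) of turning a unidirectional LSTM into a bidirectional one with the Bidirectional wrapper:

    # Minimal sketch, assuming Keras; shapes are hypothetical.
    from keras.models import Sequential
    from keras.layers import LSTM, Bidirectional, Dense

    model = Sequential()
    # The wrapper runs one LSTM left-to-right and one right-to-left and
    # concatenates their outputs, so each position sees both directions.
    model.add(Bidirectional(LSTM(64), input_shape=(100, 8)))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mse')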

How to correctly give inputs to Embedding, LSTM and Linear layers in PyTorch?

余生颓废 submitted on 2019-12-02 16:46:45
I need some clarity on how to correctly prepare inputs for batch training using different components of the torch.nn module. Specifically, I'm looking to create an encoder-decoder network for a seq2seq model. Suppose I have a module with these three layers, in order:

    nn.Embedding
    nn.LSTM
    nn.Linear

nn.Embedding
Input: batch_size * seq_length
Output: batch_size * seq_length * embedding_dimension
I don't have any problems here; I just want to be explicit about the expected shape of the input and output.

nn.LSTM
Input: seq_length * batch_size * input_size (embedding_dimension in this case)
Output
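A minimal sketch (dimensions are hypothetical) tracing a batch through the three layers, transposing between the batch-first Embedding output and the seq-first default layout that nn.LSTM expects:

    # Minimal sketch; dimensions are hypothetical.
    import torch
    import torch.nn as nn

    batch_size, seq_length, vocab_size, emb_dim, hidden = 4, 10, 100, 32, 64

    embedding = nn.Embedding(vocab_size, emb_dim)
    lstm = nn.LSTM(input_size=emb_dim, hidden_size=hidden)  # seq-first by default
    linear = nn.Linear(hidden, vocab_size)

    tokens = torch.randint(0, vocab_size, (batch_size, seq_length))
    x = embedding(tokens)        # batch_size x seq_length x emb_dim
    x = x.transpose(0, 1)        # seq_length x batch_size x emb_dim
    out, (h, c) = lstm(x)        # out: seq_length x batch_size x hidden
    logits = linear(out)         # seq_length x batch_size x vocab_size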

TF LSTM: Save State from training session for prediction session later

二次信任 submitted on 2019-12-02 10:03:21
Question: I am trying to save the latest LSTM state from training so it can be reused during the prediction stage later. The problem I am encountering is that in the TF LSTM model the state is passed from one training iteration to the next via a combination of a placeholder and a numpy array, neither of which is included in the graph by default when the session is saved. To work around this, I am creating a dedicated TF variable to hold the latest version of the state so as to add it to the
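A minimal sketch (TF 1.x; shapes and names are hypothetical) of the workaround described: a non-trainable variable that shadows the numpy state and therefore gets captured by tf.train.Saver checkpoints:

    # Minimal sketch, assuming TF 1.x; shapes and names are hypothetical.
    import tensorflow as tf

    state_size, batch_size = 64, 1

    # The variable lives in the graph, so tf.train.Saver checkpoints it.
    saved_state = tf.get_variable('saved_lstm_state',
                                  shape=[2, batch_size, state_size],  # [c, h]
                                  initializer=tf.zeros_initializer(),
                                  trainable=False)

    # After each training step, copy the latest numpy state into it.
    state_ph = tf.placeholder(tf.float32, [2, batch_size, state_size])
    update_state = tf.assign(saved_state, state_ph)

    # Training loop (pseudocode):
    #   np_state = sess.run(final_state, feed_dict=...)
    #   sess.run(update_state, feed_dict={state_ph: np_state})
    #   saver.save(sess, ckpt_path)  # state is now part of the checkpoint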

How to solve a save-and-restore Keras LSTM model error

↘锁芯ラ submitted on 2019-12-02 09:35:01
I have trained an LSTM network to predict stock prices. After training the model well, when I was trying to save and reload the model and feed it new data to predict the stock price, I received an error: Process finished with exit code 0. This part is my code for training the data:

    CONST_TRAINING_SEQUENCE_LENGTH = 12
    CONST_TESTING_CASES = 5

    def dataNormalization(data):
        return [(datum - data[0]) / data[0] for datum in data]

    def dataDeNormalization(data, base):
        return [(datum + 1) * base for datum in data]

    def getDeepLearningData(ticker):
        # Step 1. Load data
        data = pandas.read_csv('./data/Intraday/' +
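A minimal sketch (Keras; the file path is hypothetical) of the standard save/reload cycle the question is attempting:

    # Minimal sketch, assuming Keras; the path is hypothetical.
    from keras.models import load_model

    model.save('lstm_stock.h5')  # architecture + weights + optimizer state

    # ... later, in the prediction session:
    model = load_model('lstm_stock.h5')
    predictions = model.predict(new_data)  # new_data must match the
                                           # training input shape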