LSTM

Stateful LSTM: When to reset states?

拟墨画扇 submitted on 2019-11-27 21:56:10
Question: Given X with dimensions (m samples, n sequences, k features), and y labels with dimensions (m samples, 0/1): suppose I want to train a stateful LSTM (going by the Keras definition, where stateful=True means that cell states are not reset between sequences within a sample -- please correct me if I'm wrong!). Are states supposed to be reset on a per-epoch basis or a per-sample basis? Example:

    for e in epoch:
        for m in X.shape[0]:  # for each sample
            for n in X.shape[1]:  # for each sequence
                # train_on …
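A minimal sketch of the usual pattern, assuming a Keras stateful LSTM whose states are carried across batches within an epoch and reset once per epoch with model.reset_states(); the layer sizes, dummy data, and loop structure below are illustrative, not taken from the question:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    batch_size, timesteps, features = 32, 10, 8
    model = Sequential()
    model.add(LSTM(64, stateful=True,
                   batch_input_shape=(batch_size, timesteps, features)))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam')

    X = np.random.rand(320, timesteps, features)   # dummy data, divisible by batch_size
    y = np.random.randint(0, 2, size=(320, 1))

    for epoch in range(5):
        # shuffle=False keeps sample i of one batch aligned with sample i of the next
        model.fit(X, y, batch_size=batch_size, epochs=1, shuffle=False)
        model.reset_states()  # reset cell/hidden states once per epoch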

How to interpret weights in a LSTM layer in Keras [closed]

给你一囗甜甜゛ submitted on 2019-11-27 21:19:53
Question: I'm currently training a recurrent neural network for weather forecasting, using an LSTM layer. The network itself is pretty simple and looks roughly like this:

    model = Sequential()
    model.add(LSTM(hidden_neurons, input_shape=(time_steps, feature_count), return_sequences=False))
    model.add(Dense(feature_count))
    model.add(Activation("linear"))

The weights of the LSTM layer have the following shapes:

    for weight in model.get_weights():  # weights from Dense layer omitted
        print(weight.shape)
    > …
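A sketch of how those shapes can be read, assuming the usual Keras weight layout: an LSTM layer stores a kernel of shape (input_dim, 4*units), a recurrent kernel of shape (units, 4*units), and a bias of shape (4*units,), with the four gate blocks concatenated in the order input, forget, cell candidate, output. That ordering is the standard Keras convention, not something stated in the question:

    from keras.models import Sequential
    from keras.layers import LSTM

    units, time_steps, feature_count = 32, 10, 5
    model = Sequential()
    model.add(LSTM(units, input_shape=(time_steps, feature_count)))

    kernel, recurrent_kernel, bias = model.layers[0].get_weights()
    print(kernel.shape)            # (feature_count, 4 * units)
    print(recurrent_kernel.shape)  # (units, 4 * units)
    print(bias.shape)              # (4 * units,)

    # Split each matrix into the four gate blocks (i, f, c, o in Keras' ordering)
    W_i, W_f, W_c, W_o = (kernel[:, g * units:(g + 1) * units] for g in range(4))
    U_i, U_f, U_c, U_o = (recurrent_kernel[:, g * units:(g + 1) * units] for g in range(4))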

Understanding stateful LSTM

守給你的承諾、 submitted on 2019-11-27 20:41:08
I'm going through this tutorial on RNNs/LSTMs and I'm having quite a hard time understanding stateful LSTMs. My questions are as follows: 1. Training batch size. In the Keras docs on RNNs, I found that the hidden state of the sample in the i-th position within the batch will be fed as the input hidden state for the sample in the i-th position in the next batch. Does that mean that if we want to pass the hidden state from sample to sample we have to use batches of size 1 and therefore perform online gradient descent? Is there a way to pass the hidden state within a batch of size >1 and perform …
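A sketch of the usual answer for batches larger than 1: state is carried per position within the batch, so a long sequence can be chopped into consecutive chunks and stacked so that row i of each batch continues row i of the previous one. The shapes and random data here are illustrative assumptions only:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    batch_size, chunk_len, features = 4, 25, 3

    # Four long sequences, each split into consecutive chunks of length 25;
    # batch b holds chunk b of every sequence, so state flows row-wise across batches.
    long_seqs = np.random.rand(batch_size, 100, features)
    chunks = np.split(long_seqs, 100 // chunk_len, axis=1)   # list of (4, 25, 3) arrays
    targets = [np.random.rand(batch_size, 1) for _ in chunks]

    model = Sequential()
    model.add(LSTM(16, stateful=True, batch_input_shape=(batch_size, chunk_len, features)))
    model.add(Dense(1))
    model.compile(loss='mse', optimizer='adam')

    for epoch in range(3):
        for x_chunk, y_chunk in zip(chunks, targets):
            model.train_on_batch(x_chunk, y_chunk)
        model.reset_states()  # the four sequences end here; start fresh next epoch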

Building Speech Dataset for LSTM binary classification

血红的双手。 submitted on 2019-11-27 16:09:35
I'm trying to do binary LSTM classification using Theano. I have gone through the example code, but I want to build my own. I have a small set of "Hello" and "Goodbye" recordings that I am using. I preprocess these by extracting the MFCC features and saving them to text files. I have 20 speech files (10 per word) and I am generating a text file for each recording, so 20 text files containing the MFCC features. Each file is a 13x56 matrix. My problem now is: how do I use these text files to train the LSTM? I am relatively new to this. I have gone through some literature on it as …
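A minimal data-preparation sketch, assuming each text file holds a 13x56 MFCC matrix (13 coefficients x 56 frames) that can be loaded with numpy and transposed into (timesteps, features) form. The directory layout, filename-based labelling, and the small Keras classifier at the end are assumptions for illustration; the question itself uses Theano:

    import glob
    import numpy as np

    X, y = [], []
    for path in glob.glob("mfcc/*.txt"):          # hypothetical directory layout
        mat = np.loadtxt(path)                    # shape (13, 56): coefficients x frames
        X.append(mat.T)                           # -> (56, 13): timesteps x features
        y.append(1 if "hello" in path else 0)     # hypothetical labelling by filename

    X = np.stack(X)                               # (20, 56, 13)
    y = np.array(y)

    # With the data in (samples, timesteps, features) form, any recurrent model
    # can consume it; for example, a small Keras LSTM classifier:
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    model = Sequential()
    model.add(LSTM(32, input_shape=(56, 13)))
    model.add(Dense(1, activation="sigmoid"))
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    model.fit(X, y, epochs=20, batch_size=4)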

Error when checking model input: expected lstm_1_input to have 3 dimensions, but got array with shape (339732, 29)

℡╲_俬逩灬. submitted on 2019-11-27 13:59:49
My input is simply a CSV file with 339732 rows and two columns: the first being the 29 feature values, i.e. X, the second being a binary label value, i.e. Y. I am trying to train my data on a stacked LSTM model:

    data_dim = 29
    timesteps = 8
    num_classes = 2
    model = Sequential()
    model.add(LSTM(30, return_sequences=True, input_shape=(timesteps, data_dim)))  # returns a sequence of vectors of dimension 30
    model.add(LSTM(30, return_sequences=True))  # returns a sequence of vectors of dimension 30
    model.add(LSTM(30))  # returns a single vector of dimension 30
    model.add(Dense(1, activation='softmax'))
    model …
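A sketch of the usual fix, assuming the error comes from feeding a 2D array of shape (339732, 29) to a model that expects (samples, timesteps, features): the rows have to be grouped into windows of timesteps=8, and for a binary target a single sigmoid unit is appropriate rather than softmax over one unit. The windowing helper below assumes consecutive rows are consecutive time steps, which the question does not state explicitly:

    import numpy as np

    def make_windows(features, labels, timesteps=8):
        """Stack consecutive rows into (samples, timesteps, features) windows;
        each window is labelled with the label of its last row."""
        Xw, yw = [], []
        for end in range(timesteps, len(features) + 1):
            Xw.append(features[end - timesteps:end])
            yw.append(labels[end - 1])
        return np.array(Xw), np.array(yw)

    features = np.random.rand(339732, 29)          # stand-in for the CSV feature columns
    labels = np.random.randint(0, 2, size=339732)  # stand-in for the binary label column
    X, y = make_windows(features, labels, timesteps=8)
    print(X.shape)  # (339725, 8, 29) -- now 3-dimensional, as the LSTM expects

    # The output layer should be Dense(1, activation='sigmoid') with
    # binary_crossentropy loss; softmax over a single unit always outputs 1.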

ValueError: Trying to share variable rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel

不打扰是莪最后的温柔 submitted on 2019-11-27 13:43:25
Question: This is the code:

    X = tf.placeholder(tf.float32, [batch_size, seq_len_1, 1], name='X')
    labels = tf.placeholder(tf.float32, [None, alpha_size], name='labels')
    rnn_cell = tf.contrib.rnn.BasicLSTMCell(512)
    m_rnn_cell = tf.contrib.rnn.MultiRNNCell([rnn_cell] * 3, state_is_tuple=True)
    pre_prediction, state = tf.nn.dynamic_rnn(m_rnn_cell, X, dtype=tf.float32)

This is the full error: ValueError: Trying to share variable rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel, but specified shape (1024, 2048) …
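A sketch of the commonly suggested fix for this error in TensorFlow 1.x: [rnn_cell] * 3 reuses the same cell object (and therefore the same variables) for all three layers, whose input sizes differ, so a fresh cell should be created per layer. Variable names follow the question; the concrete sizes are placeholders, and this is the general pattern rather than a quoted accepted answer:

    import tensorflow as tf  # TensorFlow 1.x

    batch_size, seq_len_1, alpha_size = 32, 100, 64

    X = tf.placeholder(tf.float32, [batch_size, seq_len_1, 1], name='X')
    labels = tf.placeholder(tf.float32, [None, alpha_size], name='labels')

    # Build a *separate* BasicLSTMCell for each layer instead of repeating one object.
    m_rnn_cell = tf.contrib.rnn.MultiRNNCell(
        [tf.contrib.rnn.BasicLSTMCell(512) for _ in range(3)],
        state_is_tuple=True)

    pre_prediction, state = tf.nn.dynamic_rnn(m_rnn_cell, X, dtype=tf.float32)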

How to calculate perplexity of RNN in tensorflow

放肆的年华 submitted on 2019-11-27 13:29:33
Question: I'm running the word-level RNN implementation from the Word RNN TensorFlow project. How do I calculate the perplexity of the RNN? The following is the training code, which reports the training loss and other things each epoch:

    for e in range(model.epoch_pointer.eval(), args.num_epochs):
        sess.run(tf.assign(model.lr, args.learning_rate * (args.decay_rate ** e)))
        data_loader.reset_batch_pointer()
        state = sess.run(model.initial_state)
        speed = 0
        if args.init_from is None:
            assign_op = model.batch_pointer.assign(0)
            sess.run …
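A sketch of the standard relationship, assuming the reported training loss is an average cross-entropy per word in nats: perplexity is the exponential of that mean loss, so it can be accumulated over the batches of an epoch like this (the variable names and stand-in loss values are illustrative, not from the project's code):

    import numpy as np

    per_batch_losses = [4.2, 3.9, 3.7]       # stand-in for the losses returned by sess.run each step

    total_loss, total_batches = 0.0, 0
    for batch_loss in per_batch_losses:
        total_loss += batch_loss
        total_batches += 1

    mean_loss = total_loss / total_batches    # average cross-entropy per word (in nats)
    perplexity = np.exp(mean_loss)
    print("epoch perplexity:", perplexity)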

LSTM Notes

半世苍凉 submitted on 2019-11-27 12:47:14
LSTM Notes. I use LSTM (Long Short-Term Memory networks), GRUs, and various RNN variants for everyday tasks. Although I use them often, while reviewing interview questions recently I noticed I had forgotten some of the details, and kept forgetting them again after re-reading, so I decided to derive LSTM by hand and write it down. Here is the standard diagram first. In fact, once you have derived LSTM yourself, the diagram is of little use, because frankly it is drawn in an overly complicated way. If you are asked to derive LSTM by hand in an interview, the right approach is to start from the three gates and draw the diagram backwards from them.

First the protagonists, the three gates: the forget gate (f), the input gate (i), and the output gate (o). The other players are the previous time step's cell state (c_{t-1}), the current input (x_t), the previous time step's output (h_{t-1}), and the candidate cell state (c~). Compared with a plain RNN, each LSTM unit's input structure simply has one extra c.

Now the derivation can begin. First, be clear that each LSTM unit has three inputs (c_{t-1}, h_{t-1}, x_t) and two outputs (c_t, h_t). Both h_{t-1} and x_t are vectors; in the paper, x_t and h_{t-1} are simply concatenated horizontally (other ways of combining them exist, but they are essentially always tied together). Since both vectors have fixed length, the output of each gate's linear transformation matrix is also of fixed length.

Now the first gate. It decides which information to discard from the cell state, hence the name forget gate: f = sigmoid([x_t, h_{t-1}] W_f + b_f). Its output is a probability, the probability of forgetting. And what is being forgotten? Naturally, the cell state c_{t-1} passed in from the previous time step. With that, this part of the diagram can be drawn. The input gate (or update gate): i = sigmoid([x_t, h_{t-1}] W_i …
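A minimal numpy sketch of a single LSTM step following the standard equations and the concatenation convention described above; the candidate-state and output-gate lines continue past the point where this excerpt is cut off, and the dimensions and random weights are assumptions for illustration:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x_t, h_prev, c_prev, W, b):
        """One LSTM time step. W maps each gate name to a (in_dim + hid, hid) matrix,
        b maps each gate name to a (hid,) bias."""
        z = np.concatenate([x_t, h_prev])        # [x_t, h_{t-1}] concatenated
        f = sigmoid(z @ W['f'] + b['f'])         # forget gate
        i = sigmoid(z @ W['i'] + b['i'])         # input (update) gate
        c_tilde = np.tanh(z @ W['c'] + b['c'])   # candidate cell state c~
        c_t = f * c_prev + i * c_tilde           # new cell state
        o = sigmoid(z @ W['o'] + b['o'])         # output gate
        h_t = o * np.tanh(c_t)                   # new hidden state
        return h_t, c_t

    in_dim, hid = 8, 16
    W = {k: np.random.randn(in_dim + hid, hid) * 0.1 for k in 'fico'}
    b = {k: np.zeros(hid) for k in 'fico'}
    h, c = np.zeros(hid), np.zeros(hid)
    h, c = lstm_step(np.random.randn(in_dim), h, c, W, b)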

Using pre-trained word2vec with LSTM for word generation

时光毁灭记忆、已成空白 submitted on 2019-11-27 11:03:27
LSTM/RNN can be used for text generation. This shows a way to use pre-trained GloVe word embeddings with a Keras model. How can pre-trained Word2Vec word embeddings be used with a Keras LSTM model? This post did help. How do you predict / generate the next word when the model is given a sequence of words as its input? Sample approach tried:

    # Sample code to prepare word2vec word embeddings
    import gensim
    documents = ["Human machine interface for lab abc computer applications",
                 "A survey of user opinion of computer system response time",
                 "The EPS user interface management system",
                 "System and human system …
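A sketch of the usual bridge between gensim and Keras, assuming a Word2Vec model trained on tokenized documents (using the pre-4.0 gensim API with size= and wv.index2word, consistent with the question's era): copy its vectors into an embedding matrix indexed by a word-to-id mapping and pass that matrix to an Embedding layer as fixed weights. The tokenization, vocabulary handling, and the tiny next-word model below are illustrative assumptions, not the question's accepted answer:

    import numpy as np
    import gensim
    from keras.models import Sequential
    from keras.layers import Embedding, LSTM, Dense

    documents = ["Human machine interface for lab abc computer applications",
                 "A survey of user opinion of computer system response time"]
    sentences = [doc.lower().split() for doc in documents]
    w2v = gensim.models.Word2Vec(sentences, size=100, min_count=1)   # gensim < 4.0 API

    word_index = {w: i + 1 for i, w in enumerate(w2v.wv.index2word)}  # 0 reserved for padding
    embedding_matrix = np.zeros((len(word_index) + 1, 100))
    for word, idx in word_index.items():
        embedding_matrix[idx] = w2v.wv[word]

    model = Sequential()
    model.add(Embedding(len(word_index) + 1, 100,
                        weights=[embedding_matrix], trainable=False))
    model.add(LSTM(128))
    model.add(Dense(len(word_index) + 1, activation='softmax'))  # next-word distribution
    model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')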

Keras LSTM input dimension setting

喜夏-厌秋 submitted on 2019-11-27 08:58:03
I was trying to train an LSTM model using Keras, but I think I got something wrong here. I got an error of ValueError: Error when checking input: expected lstm_17_input to have 3 dimensions, but got array with shape (10000, 0, 20), while my code looks like

    model = Sequential()
    model.add(LSTM(256, activation="relu", dropout=0.25, recurrent_dropout=0.25,
                   input_shape=(None, 20, 64)))
    model.add(Dense(1, activation="sigmoid"))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.fit(X_train, y_train, batch_size=batch_size, epochs=10)

where X_train has a shape of …
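A sketch of the likely fix: input_shape should be (timesteps, features) without a leading None (the batch dimension is implicit in Keras), and X_train has to be a 3-D array of shape (samples, timesteps, features). The concrete numbers and random data below are assumptions chosen only to make the example run:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    timesteps, features = 20, 64
    X_train = np.random.rand(10000, timesteps, features)   # (samples, timesteps, features)
    y_train = np.random.randint(0, 2, size=(10000, 1))

    model = Sequential()
    model.add(LSTM(256, activation="relu", dropout=0.25, recurrent_dropout=0.25,
                   input_shape=(timesteps, features)))     # no leading None
    model.add(Dense(1, activation="sigmoid"))
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    model.fit(X_train, y_train, batch_size=128, epochs=10)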