lstm | 易学教程

How to use masking layer to mask input/output in LSTM autoencoders?

Question: I am trying to use an LSTM autoencoder to do sequence-to-sequence learning with variable-length sequences as inputs, using the following code:

    inputs = Input(shape=(None, input_dim))
    masked_input = Masking(mask_value=0.0, input_shape=(None,input_dim))(inputs)
    encoded = LSTM(latent_dim)(masked_input)
    decoded = RepeatVector(timesteps)(encoded)
    decoded = LSTM(input_dim, return_sequences=True)(decoded)
    sequence_autoencoder = Model(inputs, decoded)
    encoder = Model(inputs, encoded)

where inputs are

Tensorflow LSTM - Matrix multiplication on LSTM cell

Question: I'm making an LSTM neural network in Tensorflow. The input tensor size is 92.

    import tensorflow as tf
    from tensorflow.contrib import rnn
    import data

    test_x, train_x, test_y, train_y = data.get()

    # Parameters
    learning_rate = 0.001
    epochs = 100
    batch_size = 64
    display_step = 10

    # Network Parameters
    n_input = 28 # input size
    n_hidden = 128 # number of hidden layers
    n_classes = 20 # output size

    # Placeholders
    x = tf.placeholder(dtype=tf.float32, shape=[None, n_input])
    y = tf.placeholder(dtype=tf

Deep Learning: LSTM and GRU

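http://www.sohu.com/a/259957763_610300 This article avoids mathematical formulas and explains LSTM and GRU through text and illustrations; the animated figures in particular make the ideas clear at a glance.

During RNN training, information loops through the network again and again, and the updates to the model weights can become very large. Because error gradients accumulate during these updates, the network becomes unstable. In extreme cases the weight values grow large enough to overflow and produce NaN values. Explosion happens when gradients are accumulated repeatedly through network layers whose values are greater than 1, which leads to exponential growth; if the values are less than 1, the gradients vanish instead.

Because of this limitation, an RNN suffers from vanishing gradients and cannot retain memory over long spans. For example, in the sentence "I am from China, I speak Chinese." the word "China" largely determines "Chinese", but the two are too far apart for the network to link them easily. To solve this problem, LSTM uses gates, which can preserve important memories.

[Figures: RNN, LSTM]

The core of LSTM is C_t, which is the key to controlling the flow of information: its parameters decide what is kept and what is discarded as h_t is passed along. These parameters are influenced by the gates. The coefficients of the sigmoid function determine how C_t changes, and the sigmoid function in turn depends on the input and the previous state.

How does a gate exert control? https://blog.csdn.net/m0epnwstyk4/article/details/79124800
Method: multiply the gate's output vector element-wise with the vector we want to control.
Principle: the gate's output is between 0 and 1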
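As a rough illustration of the two ideas in this excerpt, the sketch below uses plain Python. It first shows how repeatedly multiplying by a factor above or below 1 grows or shrinks a value exponentially, then shows a sigmoid gate scaling a vector element-wise. The function names (`repeated_scale`, `apply_gate`) and the sample numbers are hypothetical and chosen only for this example; this is not the code of any particular LSTM implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def repeated_scale(factor: float, steps: int) -> float:
    """Scale a starting value of 1.0 by `factor` once per time step.

    A toy stand-in for a gradient passing through `steps` layers.
    """
    value = 1.0
    for _ in range(steps):
        value *= factor
    return value


def apply_gate(gate_inputs, values):
    """Multiply each value by sigmoid(gate_input), element-wise.

    Gate outputs near 1 let a value through; outputs near 0 block it.
    """
    if len(gate_inputs) != len(values):
        raise ValueError("gate_inputs and values must have the same length")
    return [sigmoid(g) * v for g, v in zip(gate_inputs, values)]


if __name__ == "__main__":
    # Factor slightly above 1: the value explodes over 100 steps.
    print(repeated_scale(1.1, 100))
    # Factor slightly below 1: the value vanishes over 100 steps.
    print(repeated_scale(0.9, 100))
    # Hypothetical gate inputs: strongly open, half open, strongly closed.
    print(apply_gate([10.0, 0.0, -10.0], [2.0, 2.0, 2.0]))
```

With a gate input of 0 the sigmoid is exactly 0.5, so the value is halved; large positive or negative gate inputs leave the value almost unchanged or almost zero.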

Running LSTM with multiple GPUs gets “Input and hidden tensors are not at the same device”

I am trying to train an LSTM layer in PyTorch. I am using 4 GPUs. When initializing, I added the .cuda() function to move the hidden layer to the GPU. But when I run the code with multiple GPUs I get this runtime error:

    RuntimeError: Input and hidden tensors are not at the same device

I have tried to solve the problem by using the .cuda() function in the forward function, like below:

    self.hidden = (self.hidden[0].type(torch.FloatTensor).cuda(), self.hidden[1].type(torch.FloatTensor).cuda())

This line seems to solve the problem, but it raises my concern that if the updated hidden layer is seen in

Add dense layer before LSTM layer in keras or Tensorflow?

Question: I am trying to implement a denoising autoencoder with an LSTM layer in between. The architecture is as follows: FC layer -> FC layer -> LSTM cell -> FC layer -> FC layer. I do not understand what my input dimension should be to implement this architecture. I tried the following code:

    batch_size = 1
    model = Sequential()
    model.add(Dense(5, input_shape=(1,)))
    model.add(Dense(10))
    model.add(LSTM(32))
    model.add(Dropout(0.3))
    model.add(Dense(5))
    model.add(Dense(1))
    model.compile(loss='mean

Using RNN to recover sine wave from noisy signal

Question: I am involved with an application that needs to estimate the state of a certain system in real time by measuring a set of (non-linearly) dependent parameters. Up until now the application was using an extended Kalman filter, but it was found to be underperforming in certain circumstances, which is likely caused by the fact that the differences between the real system and its model used in the filter are too significant to be modeled as white noise. We cannot use a more precise model for a

Integrating BERT sentence embedding into a siamese LSTM network

I am working on a text similarity project and I wanted to experiment with a siamese LSTM network. I am working on modifying this implementation https://amitojdeep.github.io/amitoj-blogs/2017/12/31/semantic-similarity.html . The code is based on using Word2Vec word embeddings and I wanted to replace that with BERT sentence embeddings https://github.com/imgarylai/bert-embedding The resulting matrix has column 1 with the input sentence strings, column 2 with each cell containing the corresponding embedding matrix (num_words, 768). My understanding is that using this embedding matrix I can simply

TensorFlow dynamic_rnn regressor: ValueError dimension mismatch

Question: I would like to build a toy LSTM model for regression. This nice tutorial is already too complicated for a beginner. Given a sequence of length time_steps, predict the next value. Consider time_steps=3 and the sequences:

    array([ [[ 1.], [ 2.], [ 3.]],
            [[ 2.], [ 3.], [ 4.]],
            ...

the target values should be:

    array([ 4., 5., ...

I define the following model:

    # Network Parameters
    time_steps = 3
    num_neurons= 64 #(arbitrary)
    n_features = 1

    # tf Graph input
    x = tf.placeholder("float", [None, time

Keras LSTM - feed sequence data with Tensorflow dataset API from the generator

I am trying to work out how to feed data to my LSTM model for training. (I will simplify the problem in my example below.) I have the following data format in the csv files in my dataset.

    Timestep  Feature1  Feature2  Feature3  Feature4  Output
    1         1         2         3         4         a
    2         5         6         7         8         b
    3         9         10        11        12        c
    4         13        14        15        16        d
    5         17        18        19        20        e
    6         21        22        23        24        f
    7         25        26        27        28        g
    8         29        30        31        32        h
    9         33        34        35        36        i
    10        37        38        39        40        j

The task is to estimate the Output of any future timestep based on the data from the last 3 timesteps. Some input-output examples are as follows:

Example 1: Input:

    Timestep  Feature1  Feature2  Feature3  Feature4
    1         1         2

Sentence embedding in keras

I am trying a simple document classification using sentence embeddings in keras. I know how to feed word vectors to a network, but I have problems using sentence embeddings. In my case, I have a simple representation of sentences (adding the word vectors along the axis, for example np.sum(sequences, axis=0)). My question is, what should I replace the Embedding layer with in the code below to feed sentence embeddings instead?

    model = Sequential()
    model.add(Embedding(len(embedding_weights), len(embedding_weights[0]), weights=[embedding_weights], mask_zero=True, input_length=MAX_SEQUENCE_LENGTH,