lstm

Using Keras for video prediction (time series)

↘锁芯ラ Submitted on 2019-11-28 22:51:16
Question: I want to predict the next frame of a (greyscale) video given N previous frames, using CNNs or RNNs in Keras. Most tutorials and other information on time-series prediction with Keras use a 1-dimensional input to the network, but mine would be 3-D (N frames x rows x cols). I'm currently quite unsure what a good approach to this problem would be. My ideas include: Using one or more LSTM layers. The problem here is that I'm not sure whether they're suited to take a series of images
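Whichever layer type is chosen, the raw frame sequence first has to be cut into (samples, N, rows, cols) windows with the following frame as the target. A minimal numpy sketch of that step (array names and sizes are illustrative assumptions, not from the question):

```python
import numpy as np

def make_windows(frames, n):
    """Slice a (T, rows, cols) frame array into inputs X of shape
    (samples, n, rows, cols) and next-frame targets y of shape
    (samples, rows, cols)."""
    X = np.stack([frames[i:i + n] for i in range(len(frames) - n)])
    y = frames[n:]
    return X, y

frames = np.random.rand(100, 64, 64)   # 100 greyscale frames, 64x64
X, y = make_windows(frames, n=5)
print(X.shape, y.shape)  # (95, 5, 64, 64) (95, 64, 64)
```

The resulting X can then feed either a ConvLSTM-style recurrent stack or a CNN that treats the N frames as channels.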

Keras using Tensorflow backend— masking on loss function

让人想犯罪 __ Submitted on 2019-11-28 22:30:37
Question: I am trying to implement a sequence-to-sequence task using an LSTM in Keras with the Tensorflow backend. The inputs are English sentences of variable length. To construct a dataset with the 2-D shape [batch_number, max_sentence_length], I add EOF at the end of each line and pad each sentence with enough placeholders, e.g. "#". Each character in a sentence is then transformed into a one-hot vector, so the dataset has the 3-D shape [batch_number, max_sentence_length, character_number]. After the LSTM encoder and
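The core of masking a loss over padded positions can be sketched in plain numpy: zero out the per-character losses wherever the target is the pad symbol, then average only over the surviving positions. This is a sketch of the idea, not Keras's own masking API, and the pad id and shapes are assumptions:

```python
import numpy as np

def masked_xent(probs, targets, pad_id):
    """Mean cross-entropy over non-padding positions only.
    probs:   (batch, time, vocab) predicted distributions,
    targets: (batch, time) integer character ids."""
    mask = (targets != pad_id).astype(float)   # 1 at real chars, 0 at "#"
    # Probability assigned to the true character at each position:
    picked = np.take_along_axis(probs, targets[..., None], axis=-1)[..., 0]
    losses = -np.log(picked + 1e-9) * mask     # padded steps contribute 0
    return losses.sum() / mask.sum()           # average over real steps only

probs = np.full((1, 4, 3), 1/3)      # uniform predictions over 3 characters
targets = np.array([[0, 1, 2, 2]])   # id 2 plays the role of the pad symbol
print(masked_xent(probs, targets, pad_id=2))
```

In Keras the same effect is usually reached with a Masking layer / `mask_zero=True` on the Embedding, or by passing `sample_weight` with zeros at padded timesteps.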

Use keras(TensorFlow) to build a Conv2D+LSTM model

眉间皱痕 Submitted on 2019-11-28 21:42:46
Question: The data are 10 videos; each video is split into 86 frames, and each frame has 28*28 pixels: video_num = 10, frame_num = 86, pixel_num = 28*28. I want to use Conv2D+LSTM to build the model, and at each time step (= frame_num = 86) feed the pixel data (= INPUT_SIZE = 28*28) into the model. The following is my code for the model: BATCH_SIZE = 2 (just a try), TIME_STEPS = frame_num (= 86), INPUT_SIZE = pixel_num (= 28*28), model = Sequential() model.add(InputLayer(batch_input_shape=(BATCH_SIZE, TIME_STEPS, INPUT
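Before any model code, the pixel data has to be flattened per frame so that each timestep is one vector of INPUT_SIZE = 28*28 values. A numpy sketch of that reshape, assuming the ten videos are stored in a single array:

```python
import numpy as np

video_num, frame_num, h, w = 10, 86, 28, 28
videos = np.random.rand(video_num, frame_num, h, w)   # (10, 86, 28, 28)

# Flatten each frame so every timestep is one 784-dim pixel vector:
# (video_num, TIME_STEPS, INPUT_SIZE) = (10, 86, 784)
lstm_input = videos.reshape(video_num, frame_num, h * w)
print(lstm_input.shape)
```

If a Conv2D front end is wanted instead, the frames would be kept as (frame_num, 28, 28, 1) and the convolution applied per timestep (e.g. via TimeDistributed) before the LSTM.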

ValueError: Trying to share variable rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel

£可爱£侵袭症+ Submitted on 2019-11-28 21:19:59
This is the code: X = tf.placeholder(tf.float32, [batch_size, seq_len_1, 1], name='X') labels = tf.placeholder(tf.float32, [None, alpha_size], name='labels') rnn_cell = tf.contrib.rnn.BasicLSTMCell(512) m_rnn_cell = tf.contrib.rnn.MultiRNNCell([rnn_cell] * 3, state_is_tuple=True) pre_prediction, state = tf.nn.dynamic_rnn(m_rnn_cell, X, dtype=tf.float32) This is the full error: ValueError: Trying to share variable rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel, but specified shape (1024, 2048) and found shape (513, 2048). I'm using a GPU version of tensorflow. Maosi Chen: I encountered a similar
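The usual cause of this error is that `[rnn_cell] * 3` puts the *same* cell object into the list three times, so all three layers try to share one kernel whose shape only fits the first layer (input 1 + hidden 512 = 513 rows vs hidden 512 + hidden 512 = 1024 rows). The fix is to build a fresh cell per layer. The aliasing itself can be seen without TensorFlow:

```python
class Cell:
    """Stand-in for BasicLSTMCell, just to show object identity."""
    pass

shared = [Cell()] * 3                 # one object, three references
fresh  = [Cell() for _ in range(3)]   # three distinct objects

print(shared[0] is shared[1])  # True  -> layers would share variables
print(fresh[0] is fresh[1])    # False -> each layer gets its own kernel
```

Applied to the TF 1.x snippet above, this corresponds to `MultiRNNCell([tf.contrib.rnn.BasicLSTMCell(512) for _ in range(3)], state_is_tuple=True)`.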

【pytorch】pytorch-LSTM

二次信任 Submitted on 2019-11-28 21:05:12
pytorch-LSTM(): the torch.nn package implements an LSTM layer via the LSTM class; multiple LSTMCells combined form an LSTM. LSTM runs the forward pass over the whole sequence automatically, so you do not need to iterate over the sequence yourself. The LSTM parameters are as follows (at least the first three must be given when creating an LSTM):
- input_size: dimensionality of the input features
- hidden_size: dimensionality of the hidden state
- num_layers: number of RNN layers (in the diagram, the vertical direction is the layers, the horizontal is seq_len)
- bias: whether the hidden state has a bias term, default True
- batch_first: whether the first dimension of input/output is batch_size. In PyTorch the batch_size dimension is the second by default, so this option moves batch_size to the first dimension; e.g. for an input of (4, 1, 5), the middle 1 is batch_size, and with batch_first=True it becomes (1, 4, 5)
- dropout: whether to add a dropout layer after every RNN layer except the last
- bidirectional: whether the RNN is bidirectional, default False; if True, num_directions=2, otherwise 1
For consistency, batch_first=True is used from here on. The LSTM input is LSTM(input, (h0, c0)), where, with batch_first=True specified, input is (batch_size, seq_len
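The shape conventions above can be summarized in a small helper. This is a hypothetical illustration function, not part of torch, but it reproduces the shapes nn.LSTM documents for batch_first=True:

```python
def lstm_shapes(batch_size, seq_len, input_size, hidden_size,
                num_layers=1, bidirectional=False):
    """Shapes of nn.LSTM tensors with batch_first=True (illustrative only)."""
    num_directions = 2 if bidirectional else 1
    return {
        "input":  (batch_size, seq_len, input_size),
        "output": (batch_size, seq_len, num_directions * hidden_size),
        "h_n":    (num_layers * num_directions, batch_size, hidden_size),
        "c_n":    (num_layers * num_directions, batch_size, hidden_size),
    }

# 2-layer bidirectional LSTM over a 4-step sequence of 5-dim features:
print(lstm_shapes(1, 4, 5, 8, num_layers=2, bidirectional=True))
```

Note that h_n and c_n keep batch_size in the *second* dimension even when batch_first=True.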

RNN笔记

喜你入骨 Submitted on 2019-11-28 18:53:26
Contents: RNN notes / Model structure / LSTM / GRU / back propagation / Back Propagation: RNN / Back Propagation: LSTM / Back Propagation: GRU / back propagation code / RNN references
RNN notes, model structure: recurrent neural networks can be used to handle sequence problems; the network structure is shown below (image from colah's blog). At time \(t\), the input feature \(\mathbf{x}_t\) is transformed by A into \(\mathbf{h}_t\), where A stands for some processing step; different RNN structures process it differently. The figure below shows the most basic RNN structure: the parameter \(\mathrm{U}\) acts on the input feature \(\mathbf{x}_t\), the parameter \(\mathrm{W}\) acts on the previous state \(\mathrm{s}_{t-1}\), and an activation function yields the current state \(\mathrm{s}_t\); then \(\mathrm{V}\) and another activation produce the current output \(\mathrm{o}_t\). The corresponding equations are:
\[
\begin{align*}
s_t &= \sigma(\mathrm{W}s_{t-1} + \mathrm{U}\mathbf{x}_t + \mathrm{b}_s)\\
o_t &= \sigma(\mathrm{V}s_t + \mathrm{b}_o)
\end{align*}
\]
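The two equations translate directly into a numpy forward step. A sketch with sigma taken as the logistic sigmoid and the weight shapes assumed:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_step(x_t, s_prev, W, U, V, b_s, b_o):
    """One vanilla-RNN step: s_t = sigma(W s_{t-1} + U x_t + b_s),
    o_t = sigma(V s_t + b_o)."""
    s_t = sigmoid(W @ s_prev + U @ x_t + b_s)
    o_t = sigmoid(V @ s_t + b_o)
    return s_t, o_t

state_dim, in_dim, out_dim = 3, 2, 1
rng = np.random.default_rng(0)
W = rng.normal(size=(state_dim, state_dim))
U = rng.normal(size=(state_dim, in_dim))
V = rng.normal(size=(out_dim, state_dim))
s = np.zeros(state_dim)                       # initial state s_0
s, o = rnn_step(rng.normal(size=in_dim), s, W, U, V, 0.0, 0.0)
print(s.shape, o.shape)  # (3,) (1,)
```

Iterating `rnn_step` over t, carrying s forward, gives the full unrolled forward pass that the backpropagation sections then differentiate.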

How to construct input data to LSTM for time series multi-step horizon with external features?

試著忘記壹切 Submitted on 2019-11-28 16:53:19
Question: I'm trying to use LSTM to do store sales forecasting. Here is what my raw data look like:

| Date       | StoreID | Sales | Temperature | Open | StoreType |
|------------|---------|-------|-------------|------|-----------|
| 01/01/2016 | 1       | 0     | 36          | 0    | 1         |
| 01/02/2016 | 1       | 10100 | 42          | 1    | 1         |
| ...        |         |       |             |      |           |
| 12/31/2016 | 1       | 14300 | 39          | 1    | 1         |
| 01/01/2016 | 2       | 25000 | 46          | 1    | 3         |
| 01/02/2016 | 2       | 23700 | 43          | 1    | 3         |
| ...        |         |       |             |      |           |
| 12/31/2016 | 2       | 20600 | 37          | 1    | 3         |
| ...        |         |       |             |      |           |
| 12/31/2016 | 10      | 19800 |
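One workable layout is to group the rows by StoreID, stack the feature columns per day, and slide a window so each sample is (lookback, n_features) with the next `horizon` days of Sales as the multi-step target. A numpy sketch for one store, with made-up numbers; `lookback` and `horizon` are assumed parameters:

```python
import numpy as np

def store_windows(features, sales, lookback, horizon):
    """features: (days, n_features) for ONE store; sales: (days,).
    Returns X of shape (samples, lookback, n_features) and
    y of shape (samples, horizon), the next `horizon` days of sales."""
    X, y = [], []
    for i in range(len(sales) - lookback - horizon + 1):
        X.append(features[i:i + lookback])
        y.append(sales[i + lookback:i + lookback + horizon])
    return np.array(X), np.array(y)

days, n_features = 365, 4            # Sales, Temperature, Open, StoreType
features = np.random.rand(days, n_features)
sales = np.random.rand(days)
X, y = store_windows(features, sales, lookback=30, horizon=7)
print(X.shape, y.shape)  # (329, 30, 4) (329, 7)
```

Windows from all ten stores can then be concatenated along the sample axis (optionally with StoreID/StoreType as static features repeated per timestep).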

Keras: the difference between LSTM dropout and LSTM recurrent dropout

醉酒当歌 Submitted on 2019-11-28 16:39:07
From the Keras documentation: dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. Can anyone point out where on the image below each dropout happens? I suggest taking a look at (the first part of) this paper. Regular dropout is applied on the inputs and/or the outputs, meaning the vertical arrows from x_t and to h_t. In your case, if you add it as an argument to your layer, it will mask the inputs; you can add a
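The distinction can be sketched without any LSTM math: `dropout` masks the input x_t entering the gates, while `recurrent_dropout` masks the recurrent state h_{t-1}. In Keras's (variational) scheme each mask is sampled once and reused at every timestep of the sequence. A toy numpy illustration of the two masks (shapes and rate are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
units, input_dim, rate = 4, 3, 0.5

# One mask per sequence, reused at every timestep:
input_mask = (rng.random(input_dim) >= rate) / (1 - rate)  # for x_t
recur_mask = (rng.random(units) >= rate) / (1 - rate)      # for h_{t-1}

x_t = np.ones(input_dim)
h_prev = np.ones(units)
print(x_t * input_mask)     # what `dropout` feeds to the gate transforms
print(h_prev * recur_mask)  # what `recurrent_dropout` feeds to the gates
```

Surviving units are scaled by 1/(1-rate) (inverted dropout) so the expected activation is unchanged.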

PyTorch - contiguous()

牧云@^-^@ Submitted on 2019-11-28 15:36:17
Question: I was going through this example of an LSTM language model on github (link). What it does in general is pretty clear to me. But I'm still struggling to understand what calling contiguous() does, which occurs several times in the code. For example, in lines 74/75 of the code, the input and target sequences of the LSTM are created. The data (stored in ids) is 2-dimensional, where the first dimension is the batch size. for i in range(0, ids.size(1) - seq_length, seq_length): # Get batch inputs and targets
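The same memory-layout issue exists in numpy, which makes it easy to see what contiguous() is for: a transpose is a view with permuted strides over the same buffer, and operations that need one flat run of memory (like torch's view()) require a contiguous copy first. A numpy analog of tensor.contiguous():

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # laid out contiguously in memory
b = a.T                          # a view: same data, transposed strides

print(a.flags['C_CONTIGUOUS'])   # True
print(b.flags['C_CONTIGUOUS'])   # False -> torch would need .contiguous()

c = np.ascontiguousarray(b)      # copies into a fresh contiguous buffer,
print(c.flags['C_CONTIGUOUS'])   # analogous to tensor.contiguous()
```

In the language-model code, the slices taken from ids are transposed/narrowed views, so contiguous() is called before view() reshapes them into (batch * seq_length,) targets.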

How to use return_sequences option and TimeDistributed layer in Keras?

梦想的初衷 Submitted on 2019-11-28 14:53:56
Question: I have a dialog corpus like the one below, and I want to implement an LSTM model that predicts a system action. The system action is described as a bit vector, and each user input is computed as a word embedding, which is also a bit vector. t1: user: "Do you know an apple?", system: "no" (action=2) t2: user: "xxxxxx", system: "yyyy" (action=0) t3: user: "aaaaaa", system: "bbbb" (action=5) So what I want to realize is a "many to many (2)" model. When my model receives a user input, it must output a system
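The question turns on output shape: with return_sequences=True an LSTM emits one vector per timestep, (batch, time, units), and wrapping the output Dense layer in TimeDistributed then applies it at every step, giving one action prediction per dialog turn. A hypothetical shape helper (illustrative, not a Keras API) contrasting the two modes:

```python
def lstm_output_shape(batch, time, units, return_sequences):
    """Output shape of an LSTM layer (illustrative helper, not Keras code)."""
    return (batch, time, units) if return_sequences else (batch, units)

# many-to-one: a single action for the whole dialog
print(lstm_output_shape(2, 3, 32, return_sequences=False))  # (2, 32)
# many-to-many: one action per turn, ready for TimeDistributed(Dense(...))
print(lstm_output_shape(2, 3, 32, return_sequences=True))   # (2, 3, 32)
```

For the corpus above, the per-turn (batch, 3, 32) output would go through TimeDistributed(Dense(n_actions, activation='softmax')) to predict action=2, 0, 5 at t1..t3.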