LSTM

Multivariate LSTM with missing values

有些话、适合烂在心里 · Submitted on 2019-11-26 16:43:42
Question: I am working on a Time Series Forecasting problem using LSTM. The input contains several features, so I am using a multivariate LSTM. The problem is that there are some missing values, for example:

        Feature 1   Feature 2   ...   Feature n
    1       2           4       ...      nan
    2       5           8       ...      10
    3       8           8       ...      5
    4      nan          7       ...      7
    5       6          nan      ...      12

Instead of interpolating the missing values, which can introduce bias into the results because there are sometimes many consecutive timestamps with missing values in the same feature, I would like to know if there
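One common workaround, sketched below with numpy only (this is not from the original question, and the function and variable names are hypothetical), is to avoid interpolation altogether: fill the gaps with a sentinel value and append a per-feature "is missing" indicator column, so the LSTM can learn how to treat the gaps itself.

import numpy as np

def add_missing_indicators(X):
    """X: (timesteps, n_features) array that may contain NaNs."""
    mask = np.isnan(X).astype(np.float32)        # 1.0 wherever a value was missing
    X_filled = np.nan_to_num(X, nan=0.0)         # sentinel value for the gaps
    return np.concatenate([X_filled, mask], axis=-1)   # (timesteps, 2 * n_features)

X = np.array([[2, 4, np.nan],
              [5, 8, 10],
              [8, 8, 5],
              [np.nan, 7, 7],
              [6, np.nan, 12]], dtype=np.float32)
X_aug = add_missing_indicators(X)                # feed X_aug[np.newaxis] to the LSTM

The augmented array keeps the original timestamps intact, so no interpolated values are invented; whether this works better than interpolation depends on the data.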

6. Recurrent Neural Networks (RNN)

北城以北 · Submitted on 2019-11-26 14:04:59
6.1 Why do we need RNNs? Time-series data are data collected at different points in time; they reflect how some object or phenomenon changes over time. An ordinary neural network, given enough training data and a good model, can map a particular input x to the expected output y. However, it generally processes each input in isolation: one input is completely unrelated to the next. In practice, many tasks need to handle sequential information, where earlier inputs are related to later ones. For example: when we interpret a sentence, understanding each word in isolation is not enough to grasp the overall meaning; we usually need to process the whole sequence of connected words. When we process video, we likewise cannot analyze each frame separately; we have to analyze the whole sequence of frames. RNNs were created to solve problems like these by handling sequential information better. 6.2 The basic RNN structure, illustrated. 6.2.1 The basic single-layer network. Before going further into RNNs, here is the most basic single-layer network: the input is $x$, and after the transformation Wx + b and an activation function f we obtain the output y. 6.2.2 The classic RNN structure, illustrated. In practice we encounter many kinds of sequential data, for example: natural language processing, where x1 can be seen as the first word, x2 as the second word, and so on; speech processing, where x1, x2, x3, ... are the audio signals of each frame; time-series problems, such as daily stock prices. A single such sequence is shown in the figure below. As discussed above, sequence data of this kind is hard to model with a plain neural network; to address this, the RNN introduces a hidden state $h$
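To make the recurrence concrete, here is a minimal numpy sketch (illustrative names, not from the original article) of what the hidden state computes: h_t = tanh(W_xh x_t + W_hh h_(t-1) + b_h) and y_t = W_hy h_t + b_y, so each output depends on the entire input prefix.

import numpy as np

def rnn_forward(xs, W_xh, W_hh, W_hy, b_h, b_y):
    """xs: sequence of input vectors; returns all outputs and the final hidden state."""
    h = np.zeros(W_hh.shape[0])
    ys = []
    for x in xs:                                   # walk the sequence step by step
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)     # hidden state carries the history
        ys.append(W_hy @ h + b_y)                  # output at this step
    return np.array(ys), h

# tiny usage example with random weights
H, D = 8, 3
rng = np.random.default_rng(0)
W_xh, W_hh, W_hy = rng.normal(size=(H, D)), rng.normal(size=(H, H)), rng.normal(size=(1, H))
ys, h_T = rnn_forward(rng.normal(size=(5, D)), W_xh, W_hh, W_hy, np.zeros(H), np.zeros(1))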

Neural Network LSTM input shape from dataframe

回眸只為那壹抹淺笑 · Submitted on 2019-11-26 12:35:52
Question: I am trying to implement an LSTM with Keras. I know that LSTMs in Keras require a 3D tensor with shape (nb_samples, timesteps, input_dim) as an input. However, I am not entirely sure what the input should look like in my case, as I have just one sample of T observations for each input, not multiple samples, i.e. (nb_samples=1, timesteps=T, input_dim=N). Is it better to split each of my inputs into samples of length T/M? T is around a few million observations for me, so how long should each
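One straightforward way to produce the 3D tensor Keras expects is to cut the single long series into windows of length M. The sketch below uses hypothetical names and shapes, not the asker's data.

import numpy as np

def to_windows(series, window):
    """series: (T, N) array; returns an array of shape (T // window, window, N)."""
    T, N = series.shape
    usable = (T // window) * window              # drop the ragged tail
    return series[:usable].reshape(-1, window, N)

T, N, M = 1_000_000, 4, 200
series = np.random.randn(T, N).astype(np.float32)
X = to_windows(series, M)                        # shape (5000, 200, 4): ready for the LSTM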

A very detailed walkthrough of LSTM implementation details in Keras code

给你一囗甜甜゛ · Submitted on 2019-11-26 12:20:25
1. First, let's look at the Embedding layer in Keras: from keras.layers.embeddings import Embedding. Its shapes are: input shape (batch_size, input_length); output shape (batch_size, input_length, output_dim). Here is an example (with a randomly initialized Embedding):

from keras.models import Sequential
from keras.layers import Embedding
import numpy as np

model = Sequential()
model.add(Embedding(1000, 64, input_length=10))
# input shape is (None, 10): None is the batch_size, 10 is the input_length (tokens per sample)
# output shape is (None, 10, 64): each token of each sample is embedded into a 64-dimensional vector
input_array = np.random.randint(1000, size=(32, 10))
model.compile('rmsprop', 'mse')
output_array = model.predict(input_array)
print(output_array.shape)  # (32, 10, 64)
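The excerpt stops at the Embedding layer; as a minimal illustrative continuation (assuming Keras 2.x layer names, not the author's exact code), the usual next step is to stack an LSTM on top of the embedding output:

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

model = Sequential()
model.add(Embedding(1000, 64, input_length=10))      # (batch, 10) -> (batch, 10, 64)
model.add(LSTM(32))                                  # consumes the (10, 64) sequence -> (batch, 32)
model.add(Dense(1, activation='sigmoid'))            # task-specific head
model.compile(optimizer='rmsprop', loss='binary_crossentropy')
model.summary()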

LSTM module for Caffe

匆匆过客 · Submitted on 2019-11-26 11:14:36
Question: Does anyone know if there exists a nice LSTM module for Caffe? I found one from a GitHub account by russel91, but apparently the webpage containing examples and explanations disappeared (formerly http://apollo.deepmatter.io/ ; it now redirects only to the GitHub page, which has no examples or explanations anymore). Answer 1: I know Jeff Donahue worked on LSTM models using Caffe. He also gave a nice tutorial during CVPR 2015. He has a pull request with RNN and LSTM. Update: there is a new PR by

When does keras reset an LSTM state?

∥☆過路亽.° · Submitted on 2019-11-26 09:25:47
Question: I have read all sorts of texts about it, and none seem to answer this very basic question. It's always ambiguous: in a stateful = False LSTM layer, does Keras reset states after each sequence, or after each batch? Suppose I have X_train shaped as (1000, 20, 1), meaning 1000 sequences of 20 steps of a single value. If I make: model.fit(X_train, y_train, batch_size=200, nb_epoch=15) Will it reset states for every single sequence (resets states 1000 times)? Or will it reset states for every batch (resets
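A short sketch contrasting the two modes (toy shapes, and Keras 2 argument names such as epochs instead of the nb_epoch used in the question): with stateful=False every batch starts from fresh zero states automatically, while with stateful=True states persist across batches until reset_states() is called explicitly.

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

X = np.random.randn(1000, 20, 1).astype('float32')
y = np.random.randn(1000, 1).astype('float32')

# stateful=False: every batch automatically starts from zero states
stateless = Sequential([LSTM(16, input_shape=(20, 1)), Dense(1)])
stateless.compile('adam', 'mse')
stateless.fit(X, y, batch_size=200, epochs=15)

# stateful=True: states persist across batches until reset_states() is called
stateful = Sequential([LSTM(16, batch_input_shape=(200, 20, 1), stateful=True), Dense(1)])
stateful.compile('adam', 'mse')
for epoch in range(15):
    stateful.fit(X, y, batch_size=200, epochs=1, shuffle=False)
    stateful.reset_states()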

TensorFlow: Remember LSTM state for next batch (stateful LSTM)

删除回忆录丶 · Submitted on 2019-11-26 09:07:26
Question: Given a trained LSTM model, I want to perform inference for single timesteps, i.e. seq_length = 1 in the example below. After each timestep, the internal LSTM (memory and hidden) states need to be remembered for the next 'batch'. At the very beginning of the inference, the internal LSTM states init_c, init_h are computed given the input. These are then stored in an LSTMStateTuple object, which is passed to the LSTM. During training this state is updated every timestep. However, for inference I
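A minimal TF 1.x-style sketch of the usual pattern (hypothetical names, not the asker's graph): expose the LSTM state as placeholders, fetch the new LSTMStateTuple after each single-step run, and feed it back in for the next 'batch' of length 1.

import numpy as np
import tensorflow as tf   # assumes TF 1.x APIs

n_hidden, n_input = 128, 10
x = tf.placeholder(tf.float32, [1, 1, n_input])       # batch of 1, seq_length = 1
c_in = tf.placeholder(tf.float32, [1, n_hidden])
h_in = tf.placeholder(tf.float32, [1, n_hidden])

cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden)
state_in = tf.nn.rnn_cell.LSTMStateTuple(c_in, h_in)
outputs, state_out = tf.nn.dynamic_rnn(cell, x, initial_state=state_in)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    c = np.zeros((1, n_hidden), np.float32)           # state before the first step
    h = np.zeros((1, n_hidden), np.float32)
    for step_input in np.random.randn(5, 1, 1, n_input).astype(np.float32):
        out, (c, h) = sess.run([outputs, state_out],
                               feed_dict={x: step_input, c_in: c, h_in: h})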

How to apply gradient clipping in TensorFlow?

为君一笑 · Submitted on 2019-11-26 04:59:59
Question: Consider the example code. I would like to know how to apply gradient clipping to this RNN, where there is a possibility of exploding gradients. tf.clip_by_value(t, clip_value_min, clip_value_max, name=None) is an example that could be used, but where do I introduce it? In the definition of the RNN:

lstm_cell = rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0)
# Split data because rnn cell needs a list of inputs for the RNN inner loop
_X = tf.split(0, n_steps, _X) # n_steps tf
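The clipping is usually introduced not inside the RNN definition but in the training step, between compute_gradients and apply_gradients. Below is a minimal TF 1.x sketch with a toy variable and loss standing in for the RNN graph from the question:

import tensorflow as tf   # assumes TF 1.x APIs

# toy variable and loss standing in for the RNN graph from the question
w = tf.Variable(tf.random_normal([10, 1]))
loss = tf.reduce_mean(tf.square(tf.matmul(tf.ones([4, 10]), w)))

optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
grads_and_vars = optimizer.compute_gradients(loss)
clipped = [(tf.clip_by_value(g, -1.0, 1.0), v)        # element-wise clipping of each gradient
           for g, v in grads_and_vars if g is not None]
train_op = optimizer.apply_gradients(clipped)

For RNNs, tf.clip_by_global_norm over the full gradient list is often preferred to element-wise tf.clip_by_value, since it rescales the whole gradient and preserves its direction.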

TensorFlow: initializing the LSTM weight and bias parameters

我只是一个虾纸丫 · Submitted on 2019-11-26 02:51:25
TensorFlow: initializing the LSTM weight and bias parameters. Preface: the previous blog post showed how to visualize each layer of a neural network; the simple approach is to use trained parameters as the network's initial values and run a forward pass. For an LSTM, the official documentation shows that the only parameter that can be initialized is the weight. For visualization we often need to pass in both the weight and the bias, and loading model parameters to continue training also requires loading w and b. Initializing the LSTM weight: initializing w is straightforward; just pass it in as a constant. Here the get_parameter function reads model parameters into numpy (see the previous post). Because this is a two-layer LSTM and each layer has its own parameters, they are written in two passes, assigning w from a constant:

multi_rnn_cells = [get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_0/lstm_cell/kernel'),
                   get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_0/lstm_cell/bias'),
                   get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_1/lstm_cell/kernel'),
                   get_parameter(model_dir,
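As an alternative to passing the kernel in as a constant, one way to push numpy kernels and biases into an already-built multi-layer LSTM graph is to assign them by variable name after the session is created. This is an illustrative TF 1.x sketch, not the author's exact code; the scope names mirror the ones quoted above and the dict contents are hypothetical.

import tensorflow as tf   # assumes TF 1.x APIs

def load_lstm_params(sess, numpy_params):
    """numpy_params: dict mapping variable names (without ':0') to numpy arrays."""
    assign_ops = []
    for var in tf.global_variables():
        if var.op.name in numpy_params:
            assign_ops.append(tf.assign(var, numpy_params[var.op.name]))
    sess.run(assign_ops)

# usage after the graph is built and the session created (array values are hypothetical):
# load_lstm_params(sess, {
#     'generator_model/rnn/multi_rnn_cell/cell_0/lstm_cell/kernel': kernel_0,
#     'generator_model/rnn/multi_rnn_cell/cell_0/lstm_cell/bias':   bias_0,
# })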

A complete summary of BERT, past and present

半腔热情 · Submitted on 2019-11-26 01:35:27
I. Introduction to BERT. BERT, from Google's AI lab, has profoundly reshaped the NLP landscape. After BERT, NLP architectures, training methods, and language models sprang up like mushrooms, for example Google's TransformerXL, OpenAI's GPT-2, XLNet, ERNIE 2.0, and RoBERTa. The BERT team describes the framework as follows: BERT stands for Bidirectional Encoder Representations from Transformers; it pretrains deep bidirectional representations on unlabeled text by conditioning on context from both directions. Once pretraining is finished, you only need to fine-tune the pretrained BERT model and add a task-specific output layer to obtain SOTA (state-of-the-art) results. BERT is pretrained on a large amount of unlabeled text, including all of Wikipedia (2.5 billion words) and the BooksCorpus (800 million words). The most appealing aspect of BERT is that we can achieve SOTA results on all kinds of NLP tasks simply by adding an output layer on top of the model to suit our own needs. II. From word embeddings to BERT. 1. Pretraining in images. Since deep learning took off, pretraining has been a fairly standard practice in the image and video domains; it has a long history and works well, noticeably improving downstream applications. The pretraining process is shown in the figure above: after the network structure is designed, for images it is usually a multi-layer stacked CNN