lstm

LSTM-Attention Layer Network dimensions for classification task

感情迁移 submitted on 2020-01-25 06:49:23
Question: I figured I'd build an attention model, but got confused (again) about each layer's dimensions. Say I have 90 documents, each composed of 200 sentence-vectors, and each sentence-vector has size 500 (each sentence embedded as 1x500). The task is classification of each document, and the sentence-vectors are already embedded! # Creating random features xx = np.random.randint(100, size=(90,200,500)) y = np.random.randint(2, size=(90,1)) In the end, the attention-layer should return the
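A minimal sketch of one way to wire this up, assuming the pre-embedded sentences are fed straight into an LSTM and a simple soft-attention pooling produces one document vector; the layer sizes (64 units) and training settings are illustrative, not taken from the question:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

# Random features: 90 documents x 200 sentence-vectors of size 500, binary labels
xx = np.random.randint(100, size=(90, 200, 500)).astype("float32")
y = np.random.randint(2, size=(90, 1))

inp = layers.Input(shape=(200, 500))                  # (batch, sentences, embedding)
h = layers.LSTM(64, return_sequences=True)(inp)       # (batch, 200, 64)

# Soft attention: one score per sentence, normalized over the 200 sentences
scores = layers.Dense(1)(h)                           # (batch, 200, 1)
weights = layers.Softmax(axis=1)(scores)              # (batch, 200, 1)
context = layers.Lambda(
    lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([h, weights])  # (batch, 64)

out = layers.Dense(1, activation="sigmoid")(context)  # one prediction per document
model = Model(inp, out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(xx, y, epochs=2, batch_size=16)
```

The attention pooling here returns a single 64-dimensional context vector per document, which is what the final sigmoid classifies.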

Meta-Learning Series (6): A Detailed Analysis of the Neural Turing Machine

女生的网名这么多〃 submitted on 2020-01-25 02:08:57
The Neural Turing Machine (NTM) is an improved version of the LSTM/GRU. In essence it still contains an external memory structure that can be read from and written to; the improvement mainly targets the read/write operations, or rather, it proposes a new way of performing them. The NTM is so named because it simulates a Turing machine with a deep learning model, but I think introducing the concept of a Turing machine first would only cause confusion, so here I explain the NTM mainly in terms of how it improves on the LSTM. Since the model's structure is fairly complex, the explanation is also split into several parts to keep the ideas clear.

Overview. First, let's look at how the NTM operates. Like an LSTM, the NTM receives an input and returns an output at every time step. The input is first processed by the controller, which passes the processed input together with a set of parameters to the read/write heads. Based on these, the heads compute weights and perform erase, write, and read operations on the memory matrix. Finally, the read head returns the retrieved memory to the controller, which uses it to compute the output for that time step and then waits for the next time step's input.

Memory weights. In my view, the NTM's main improvement is to the LSTM's gate structure. Previously, we applied separate linear transformations to the previous time step's output and the current input, summed them, and passed the result through a sigmoid to obtain a weight vector used for remembering or forgetting, which was then multiplied element-wise with the long-term memory. This weight-computation mechanism is learned from data by the neural network, and it certainly works. The NTM, however, takes the perspective of attention, closer to human reasoning, and proposes to compute the weights separately from content
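As a concrete illustration of the content-based part of that weight computation, here is a minimal NumPy sketch (cosine similarity between a controller-emitted key and every memory row, sharpened by a softmax); the names, shapes, and sharpening factor are illustrative, not code from the article:

```python
import numpy as np

def content_addressing(memory, key, beta):
    """Content-based addressing weights, as used by NTM read/write heads.

    memory: (N, M) matrix of N memory slots of width M
    key:    (M,)   key vector emitted by the controller
    beta:   scalar sharpening factor (> 0)
    returns (N,) weights that sum to 1
    """
    eps = 1e-8
    # Cosine similarity between the key and every memory row
    sim = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + eps)
    # Softmax with sharpening: larger beta -> more peaked attention
    e = np.exp(beta * (sim - sim.max()))
    return e / e.sum()

memory = np.random.randn(128, 20)   # 128 slots, width 20
key = np.random.randn(20)
w = content_addressing(memory, key, beta=5.0)
read_vector = w @ memory            # read: weighted sum of memory rows
```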

Tensorflow LSTM Error (ValueError: Shapes must be equal rank, but are 2 and 1)

流过昼夜 submitted on 2020-01-24 12:43:26
Question: I know questions like this have been asked many times, but I am kind of new to TensorFlow and none of the previous threads could solve my issue. I am trying to implement an LSTM over series of sensor data to classify the data. I want my data to be classified as 0 or 1, so it's a binary classifier. I have 2539 samples overall, each of which has 555 time_steps, and each time_step carries 9 features, so my input has shape (2539, 555, 9), and for each sample I have a label array which holds the value 0 or 1
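For reference, a minimal tf.keras sketch with the shapes described in the question, using a single sigmoid unit and labels of shape (2539, 1); the unit count and training settings are illustrative, and this is not the asker's code (the rank-mismatch error typically comes from label and logit shapes that do not line up):

```python
import numpy as np
from tensorflow.keras import Sequential, layers

# Dummy data with the question's shapes: 2539 samples, 555 time steps, 9 features
x = np.random.randn(2539, 555, 9).astype("float32")
y = np.random.randint(2, size=(2539, 1))              # one 0/1 label per sample

model = Sequential()
model.add(layers.LSTM(64, input_shape=(555, 9)))      # last hidden state only
model.add(layers.Dense(1, activation="sigmoid"))      # single probability per sample
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, y, epochs=2, batch_size=64)
```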

add LSTM/GRU to BERT embeddings in keras tensorflow

℡╲_俬逩灬. submitted on 2020-01-24 11:34:10
Question: I am experimenting with BERT embeddings following this code: https://github.com/strongio/keras-bert/blob/master/keras-bert.py These are the important bits of the code (lines 265-267): bert_output = BertLayer(n_fine_tune_layers=3)(bert_inputs) dense = tf.keras.layers.Dense(256, activation="relu")(bert_output) pred = tf.keras.layers.Dense(1, activation="sigmoid")(dense) I want to add a GRU between BertLayer and the Dense layer: bert_output = BertLayer(n_fine_tune_layers=3)(bert_inputs) gru_out =
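A hedged sketch of how the continuation might look, assuming bert_output is the per-token sequence output with shape (batch, seq_len, 768); BertLayer and bert_inputs come from the linked keras-bert.py. If the layer returns only the pooled (batch, 768) vector, a GRU has no time axis to run over, and the output would first need to be reshaped or the layer changed to return sequences:

```python
import tensorflow as tf

# Continuation of the question's snippet; assumes bert_output is 3-D:
# (batch, seq_len, 768), i.e. one vector per token.
bert_output = BertLayer(n_fine_tune_layers=3)(bert_inputs)

gru_out = tf.keras.layers.GRU(128)(bert_output)               # (batch, 128)
dense = tf.keras.layers.Dense(256, activation="relu")(gru_out)
pred = tf.keras.layers.Dense(1, activation="sigmoid")(dense)
```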

PyTorch Basics: Generating Simple Sequences with an RNN

̄綄美尐妖づ submitted on 2020-01-23 19:39:18
1. Introduction. Content: sequence prediction with an RNN. Today, starting from a basic example of using an RNN to generate a simple sequence, we will peek into the secret of how neural networks generate symbol sequences. We first let the neural network model learn a context-free grammar of the form 0^n 1^n, and then let the model try to generate such strings. Along the way we demonstrate how to use the relevant RNN and LSTM functions. Key points of the experiment: what a context-free grammar is; how to generate simple sequences with an RNN or LSTM model; exploring the internal workings of the RNN's memory.

2. What is a context-free grammar? First, let us observe the following sequences: 01, 0011, 000111, 00001111, ... What are their characteristics and patterns? They contain only 0s and 1s in consecutive runs, and the sequences have different lengths, but within each sequence the number of 0s equals the number of 1s. We can describe the common rule of all these 01 sequences with a simple mathematical expression, namely 0^n 1^n, where n is the number of 0s (or 1s) in the sequence. Such sequences look simple, but in computer science they carry a rather famous name: "context-free grammar". Simply put, a context-free grammar is one that can be generated by a set of substitution rules, independent of the context (the characters appearing before and after).

Generating context-free grammar sequences. For 0^n 1^n sequences of the form above, we humans learn to count the number n of 0s, from which the number of 1s naturally follows
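For orientation, a minimal PyTorch sketch of the kind of model involved: a character-level LSTM trained by next-symbol prediction on 0^n 1^n strings with an end-of-sequence token. The vocabulary, hidden size, and training loop are illustrative, not the course's code:

```python
import torch
import torch.nn as nn

# Vocabulary: 0, 1, and an end-of-sequence token (2)
class SeqModel(nn.Module):
    def __init__(self, hidden=10):
        super().__init__()
        self.embed = nn.Embedding(3, 8)
        self.lstm = nn.LSTM(8, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 3)

    def forward(self, x):
        h, _ = self.lstm(self.embed(x))
        return self.out(h)                 # next-symbol logits at every step

def make_sample(n):
    s = [0] * n + [1] * n + [2]            # e.g. n = 3 -> 0001112
    return torch.tensor(s[:-1]).unsqueeze(0), torch.tensor(s[1:]).unsqueeze(0)

model = SeqModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for step in range(2000):
    n = torch.randint(1, 10, (1,)).item()  # random sequence length each step
    x, y = make_sample(n)
    logits = model(x)
    loss = loss_fn(logits.view(-1, 3), y.view(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```

After training, sampling symbols one at a time from the model's predictions should reproduce strings with matching counts of 0s and 1s, which is the "memory" behaviour the article goes on to probe.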

Keras LSTM dense layer multidimensional input

最后都变了- submitted on 2020-01-23 12:53:33
Question: I'm trying to create a Keras LSTM to predict time series. My x_train is shaped (3000, 15, 10) (Examples, Timesteps, Features), y_train is shaped (3000, 15, 1), and I'm trying to build a many-to-many model (10 input features per sequence make 1 output per sequence). The code I'm using is this: model = Sequential() model.add(LSTM( 10, input_shape=(15, 10), return_sequences=True)) model.add(Dropout(0.2)) model.add(LSTM( 100, return_sequences=True)) model.add(Dropout(0.2)) model.add(Dense(1, activation=
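A minimal sketch of the complete many-to-many model, assuming a regression loss; TimeDistributed is used to make the per-timestep output explicit, although in current Keras a plain Dense applied to a 3-D tensor behaves the same way and yields (batch, 15, 1), matching y_train:

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense, TimeDistributed

x_train = np.random.randn(3000, 15, 10).astype("float32")
y_train = np.random.randn(3000, 15, 1).astype("float32")

model = Sequential()
model.add(LSTM(10, input_shape=(15, 10), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(100, return_sequences=True))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(1)))        # one output per timestep: (batch, 15, 1)
model.compile(optimizer="adam", loss="mse")
model.fit(x_train, y_train, epochs=2, batch_size=32)
```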

Sentence embedding in keras

时光怂恿深爱的人放手 submitted on 2020-01-23 03:59:25
Question: I am trying a simple document classification using sentence embeddings in Keras. I know how to feed word vectors to a network, but I have problems using sentence embeddings. In my case, I have a simple representation of sentences (summing the word vectors along an axis, for example np.sum(sequences, axis=0)). My question is: what should I replace the Embedding layer with in the code below to feed sentence embeddings instead? model = Sequential() model.add(Embedding(len(embedding_weights),
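One common answer is to drop the Embedding layer entirely and declare the input as a sequence of pre-computed sentence vectors. A minimal sketch under hypothetical shapes (30 sentences per document, 300-dimensional sentence vectors, zero-padding for short documents):

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense

num_docs, max_sentences, embed_dim = 1000, 30, 300           # hypothetical sizes
x = np.random.randn(num_docs, max_sentences, embed_dim).astype("float32")
y = np.random.randint(2, size=(num_docs, 1))

model = Sequential()
# No Embedding layer: each document is already a (max_sentences, embed_dim) matrix.
model.add(Masking(mask_value=0.0, input_shape=(max_sentences, embed_dim)))  # skip zero-padded rows
model.add(LSTM(64))
model.add(Dense(1, activation="sigmoid"))
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, y, epochs=2, batch_size=32)
```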

Batch normalization layer for CNN-LSTM

只谈情不闲聊 submitted on 2020-01-22 16:11:09
Question: Suppose that I have a model like this (a model for time series forecasting): ipt = Input((data.shape[1], data.shape[2])) # 1 x = Conv1D(filters = 10, kernel_size = 3, padding = 'causal', activation = 'relu')(ipt) # 2 x = LSTM(15, return_sequences = False)(x) # 3 x = BatchNormalization()(x) # 4 out = Dense(1, activation = 'relu')(x) # 5 Now I want to add a batch normalization layer to this network. Considering the fact that batch normalization doesn't work with LSTM, can I add it before
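A minimal sketch of one common placement, normalizing the Conv1D activations before they reach the LSTM rather than after it; the timestep/feature sizes stand in for data.shape[1:], and whether this placement actually helps depends on the data (recurrent-friendly alternatives such as LayerNormalization are also worth trying):

```python
from tensorflow.keras.layers import Input, Conv1D, BatchNormalization, LSTM, Dense
from tensorflow.keras.models import Model

ipt = Input((555, 9))                       # placeholder for (data.shape[1], data.shape[2])
x = Conv1D(filters=10, kernel_size=3, padding='causal', activation='relu')(ipt)
x = BatchNormalization()(x)                 # normalize conv activations per channel
x = LSTM(15, return_sequences=False)(x)
out = Dense(1, activation='relu')(x)

model = Model(ipt, out)
model.compile(optimizer='adam', loss='mse')
```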