
A Summary of LSTM Usage

故事扮演 submitted on 2019-12-28 00:39:02
# Define the LSTM cell.
lstm = tf.nn.rnn_cell.BasicLSTMCell(lstm_hidden_size)

# Initialize the LSTM state to all zeros.
# state.c and state.h correspond to the c state and the h state in the figure.
# As with other neural networks, each optimization step of a recurrent neural
# network also uses one batch of training samples.
state = lstm.zero_state(batch_size, tf.float32)

# Define the loss.
loss = 0.0

# Forward pass.
for i in range(num_steps):
    # Each iteration processes one moment of the time series. Feeding the
    # current input current_input (Xt) and the previous state (Ht-1 and Ct-1)
    # into the LSTM yields the current output lstm_output (Ht) and the updated
    # state (Ht and Ct). lstm_output is passed on to other layers, while state
    # is passed to the next time step; the two can be treated differently,
    # for example with respect to dropout.
    lstm_output, state = lstm(current_input, state)
    # Feed the LSTM output at the current step into a fully connected layer
    # to get the final output.
    final_output = fully_connected(lstm_output)
    # Accumulate the loss at the current step.
    loss += calc_loss(final_output, expected_output)
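
For reference, a minimal runnable counterpart of the unrolled loop above, written here with the tf.keras API (TF 2.x, eager execution); the dimensions, the dummy data and the squared-error loss are placeholders for illustration, not part of the original snippet.

import tensorflow as tf

batch_size, num_steps, input_dim, lstm_hidden_size = 4, 10, 8, 16
cell = tf.keras.layers.LSTMCell(lstm_hidden_size)
fully_connected = tf.keras.layers.Dense(1)

inputs = tf.random.normal([batch_size, num_steps, input_dim])   # dummy series
targets = tf.random.normal([batch_size, num_steps, 1])          # dummy targets

# The state is the pair (h, c), both initialized to zeros.
state = [tf.zeros([batch_size, lstm_hidden_size]),
         tf.zeros([batch_size, lstm_hidden_size])]

loss = 0.0
for t in range(num_steps):
    current_input = inputs[:, t, :]                  # Xt
    lstm_output, state = cell(current_input, state)  # Ht and (Ht, Ct)
    final_output = fully_connected(lstm_output)
    loss += tf.reduce_mean(tf.square(final_output - targets[:, t, :]))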

Recurrent Neural Networks vs LSTM

走远了吗. submitted on 2019-12-26 11:04:39
Recurrent Neural Network

RNNs are good at handling sequence problems; let's look at how they work. The idea is this: each hidden unit of the network has a corresponding memory cell that stores that unit's output at the current time step, and this memory is fed back into the RNN as part of the input at the next time step, and so on.

Consider an example. Suppose every weight is 1, there are no biases, every activation function is linear, and the memory cells start at 0. Feed in the sequence [1,1], [1,1], [2,2], ... and compute the outputs step by step. For the first input [1,1], each hidden unit computes 1×1 + 1×1 + 0×1 = 2, and the output units then compute 2×1 + 2×1 = 4, so the network outputs [4,4]; the hidden values [2,2] are written into the memory cells and become part of the next step's input. Continuing this way gives the output sequence [4,4], [12,12], [32,32], ... A different input sequence would of course give a different output sequence.

An RNN uses the same network (the same weights) at every time step; only the input and the memory change. This is what allows it, when analysing a sentence, to distinguish the different meanings of the same word appearing in different positions. RNNs can also be deep, and there are two classic variants, the Elman network (which feeds back the hidden state) and the Jordan network (which feeds back the output). There are also bidirectional RNNs, which take both the earlier and the later parts of a sentence into account.

Long Short-term Memory (LSTM)

An LSTM cell has 4 inputs and 1 output: besides the cell input itself, there are three signals controlling the input, forget and output gates.
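
The toy computation above is easy to verify in a few lines; a minimal sketch (numpy), assuming the all-ones weights, linear activations and zero-initialized memory stated in the example:

import numpy as np

# Toy Elman-style RNN from the example: all weights are 1, no bias,
# linear activations, memory (hidden state) starts at zero.
W_in = np.ones((2, 2))    # input  -> hidden
W_mem = np.ones((2, 2))   # memory -> hidden
W_out = np.ones((2, 2))   # hidden -> output

memory = np.zeros(2)
for x in [np.array([1, 1]), np.array([1, 1]), np.array([2, 2])]:
    hidden = x @ W_in + memory @ W_mem   # first step: [2, 2]
    output = hidden @ W_out              # first step: [4, 4]
    memory = hidden                      # stored for the next time step
    print(output)                        # [4, 4], then [12, 12], then [32, 32]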

indices = 2 is not in [0, 1)

谁说我不能喝 submitted on 2019-12-25 18:47:19
Question: I'm working on a seq2sql project and I have successfully built a model, but I get an error when training. I'm not using any Keras embedding layer.

M = 13          # question length
d = 40          # dimension of the LSTM
C = 12          # number of table columns
batch_size = 9

inputs1 = Input(shape=(M, 100), name='question_token')
Hq = Bidirectional(LSTM(d, return_sequences=True), name='QuestionENC')(inputs1)  # Hq shape is (num_samples, 13, 80)

inputs2 = Input(shape=(C, 3, 100), name='col_token')
col_lstm_layer = Bidirectional(LSTM(d, return
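
For context, a minimal sketch of how the two encoders described above could be wired into one model; the TimeDistributed wrapper around the column encoder (to handle its 4-D input) and the name 'ColumnENC' are assumptions for illustration, not the asker's actual code.

from tensorflow.keras.layers import Input, LSTM, Bidirectional, TimeDistributed
from tensorflow.keras.models import Model

M, d, C = 13, 40, 12

inputs1 = Input(shape=(M, 100), name='question_token')
Hq = Bidirectional(LSTM(d, return_sequences=True), name='QuestionENC')(inputs1)  # (None, 13, 80)

inputs2 = Input(shape=(C, 3, 100), name='col_token')
# The column input is 4-D, so each column's 3 tokens are encoded separately by
# wrapping the bidirectional LSTM in TimeDistributed (an assumption, not the
# original model), giving one 80-dim vector per column.
Hc = TimeDistributed(Bidirectional(LSTM(d)), name='ColumnENC')(inputs2)          # (None, 12, 80)

model = Model(inputs=[inputs1, inputs2], outputs=[Hq, Hc])
model.summary()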

Learn Something New! Understanding the Attention Mechanism in Deep Learning in One Article

早过忘川 submitted on 2019-12-25 16:39:20
Full text: about 11,413 characters; estimated reading time 33 minutes.

"Every once in a while, a revolutionary product comes along that changes everything." (Steve Jobs)

What does one of the best-known quotes of the 21st century have to do with deep learning? Think about it: growing computing power has brought a series of unprecedented breakthroughs, and if you trace them back to their source, the answer points to the attention mechanism. In short, this new concept is changing the way we apply deep learning.

The attention mechanism is one of the most valuable breakthroughs in deep learning research of the past decade. It has driven many of the recent advances in natural language processing (NLP), including the Transformer architecture and Google's BERT. If you work in NLP (or plan to), you need to know what attention is and how it works.

This article covers the basics of several attention mechanisms, their workflow, and the assumptions and intuition behind them. It also gives the mathematical formulation of attention, plus code that lets you easily implement attention-based architectures in Python.

Outline
- The attention mechanism has changed the way we apply deep learning algorithms
- It has transformed fields such as natural language processing (NLP) and even computer vision
- This article explains how attention works in deep learning and how to implement it in Python

Contents
1. What is attention?
   1. How attention was introduced into deep learning
   2. Understanding the attention mechanism
2. Implementing a simple attention model in Python with Keras
3. Global vs. local attention
4.
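
The article's own Keras code is not included in this excerpt; as a stand-in, here is a minimal sketch of a simple additive (Bahdanau-style) attention layer that pools LSTM outputs into a context vector. The class name SimpleAttention, the layer sizes and the toy classification head are illustrative assumptions, not the article's implementation.

import tensorflow as tf
from tensorflow.keras import layers

class SimpleAttention(layers.Layer):
    """Additive (Bahdanau-style) attention pooling over a sequence of states."""
    def __init__(self, units=64, **kwargs):
        super().__init__(**kwargs)
        self.score_hidden = layers.Dense(units, activation='tanh')
        self.score_out = layers.Dense(1)

    def call(self, hidden_states):                                  # (batch, timesteps, d)
        scores = self.score_out(self.score_hidden(hidden_states))   # (batch, timesteps, 1)
        weights = tf.nn.softmax(scores, axis=1)                     # attention weights over time
        return tf.reduce_sum(weights * hidden_states, axis=1)       # context vector, (batch, d)

# Illustrative usage: an attention-pooled LSTM classifier on toy shapes.
inputs = layers.Input(shape=(50, 32))                # 50 timesteps, 32 features
h = layers.LSTM(64, return_sequences=True)(inputs)
context = SimpleAttention(units=64)(h)
outputs = layers.Dense(1, activation='sigmoid')(context)
model = tf.keras.Model(inputs, outputs)
model.summary()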

Tensorflow - LSTM state reuse within batch

孤街浪徒 submitted on 2019-12-25 08:24:36
Question: I am working on a TensorFlow NN which uses an LSTM to track a parameter (a time-series regression problem). A batch of training data contains batch_size consecutive observations. I would like to use the LSTM state as input to the next sample: if I have a batch of data observations, I would like to feed the state of the first observation as input to the second observation, and so on. Below, I define the LSTM state as a tensor of size batch_size. I would like to reuse the state
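
A minimal sketch of one way to chain the LSTM state through consecutive observations, using the TF1-style tf.nn.rnn_cell API that other snippets on this page use; the dimensions and the placeholder are illustrative, not taken from the question.

import tensorflow.compat.v1 as tf   # TF1-style graph API (plain `tensorflow` on TF 1.x)

input_dim, num_units, batch_size = 8, 32, 16

cell = tf.nn.rnn_cell.LSTMCell(num_units)

# One batch holds batch_size consecutive observations: (batch_size, input_dim).
observations = tf.placeholder(tf.float32, [batch_size, input_dim])

# Process the observations one at a time, feeding the state produced by
# observation i into observation i + 1 instead of resetting it.
state = cell.zero_state(1, tf.float32)
outputs = []
for i in range(batch_size):
    obs = tf.expand_dims(observations[i], 0)   # shape (1, input_dim)
    output, state = cell(obs, state)           # state is carried forward
    outputs.append(output)
outputs = tf.concat(outputs, axis=0)           # (batch_size, num_units)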

Multiply multiple tensors pairwise keras

夙愿已清 submitted on 2019-12-25 04:14:05
Question: I want to ask if it is possible to multiply two tensors pairwise. For example, I have the tensor output of an LSTM layer:

lstm = LSTM(128, return_sequences=True)(input)
output = some_function()(lstm)

some_function() should compute h1*h2, h2*h3, ..., h(n-1)*hn. I found "How do I take the squared difference of two Keras tensors?" a little helpful, but since I will have a trainable parameter, I will have to write my own layer. Also, will the some_function layer work out the dimension automatically, given that its output will have n-1 steps? I am
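
One way to get those pairwise products, shown here as a hedged sketch rather than an answer taken from the thread, is a Lambda layer that multiplies the sequence by a shifted copy of itself (no trainable parameters involved):

from tensorflow.keras.layers import Input, LSTM, Lambda
from tensorflow.keras.models import Model

inputs = Input(shape=(None, 64))                     # (batch, n, features)
h = LSTM(128, return_sequences=True)(inputs)         # h1 ... hn

# Elementwise products of consecutive hidden states: h1*h2, h2*h3, ..., h(n-1)*hn.
pairwise = Lambda(lambda x: x[:, :-1, :] * x[:, 1:, :])(h)   # (batch, n-1, 128)

model = Model(inputs, pairwise)
model.summary()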

How can I grid search different values for my keras model in python?

自闭症网瘾萝莉.ら submitted on 2019-12-25 02:15:33
Question: I've implemented an LSTM in Keras, using the following three hyperparameters:

embedding_size
hidden_layer_size
learning_rate

I now want to find the values that fit my model best. For example, I have 3 candidate values for each property (like embedding_size: [100, 150, 200], hidden_layer_size: [50, 100, 150], learning_rate: [0.015, 0.01, 0.005]), and I would love to know which combination works best. I thought I could build my function like this: def lstm
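
A minimal sketch of a plain grid search over those three hyperparameters with itertools.product; build_and_evaluate is a hypothetical stand-in for the asker's training-and-scoring function, not an existing API:

import itertools
import random

param_grid = {
    'embedding_size': [100, 150, 200],
    'hidden_layer_size': [50, 100, 150],
    'learning_rate': [0.015, 0.01, 0.005],
}

def build_and_evaluate(embedding_size, hidden_layer_size, learning_rate):
    """Hypothetical helper: build the Keras model, train it, and return a
    validation score (higher is better). A dummy value stands in here."""
    return random.random()

best_score, best_params = float('-inf'), None
for values in itertools.product(*param_grid.values()):
    params = dict(zip(param_grid.keys(), values))
    score = build_and_evaluate(**params)
    if score > best_score:
        best_score, best_params = score, params

print(best_params, best_score)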

How to mask the inputs in an LSTM autoencoder having a RepeatVector() layer?

泄露秘密 submitted on 2019-12-25 01:43:50
Question: I have been trying to obtain a vector representation of a sequence of vectors using an LSTM autoencoder, so that I can classify the sequence with an SVM or another supervised algorithm. The amount of data prevents me from using a fully connected dense layer for classification. The shortest of my inputs is 7 timesteps and the longest sequence is 356 timesteps. Accordingly, I have padded the shorter sequences with zeros to obtain a final x_train of shape (1326, 356, 8), where 1326
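
For reference, a minimal sketch of the kind of masked LSTM autoencoder being described, using the shapes quoted above; the latent size is illustrative. Note that RepeatVector does not propagate the mask, which is the crux of the question:

from tensorflow.keras.layers import Input, Masking, LSTM, RepeatVector, TimeDistributed, Dense
from tensorflow.keras.models import Model

timesteps, features, latent_dim = 356, 8, 64

inputs = Input(shape=(timesteps, features))
masked = Masking(mask_value=0.0)(inputs)          # ignore zero-padded timesteps in the encoder
encoded = LSTM(latent_dim)(masked)                # fixed-size vector representation

# RepeatVector tiles the encoding for the decoder, but it does not carry the
# mask forward, so the decoder and the loss still see the padded timesteps.
decoded = RepeatVector(timesteps)(encoded)
decoded = LSTM(latent_dim, return_sequences=True)(decoded)
decoded = TimeDistributed(Dense(features))(decoded)

autoencoder = Model(inputs, decoded)
autoencoder.compile(optimizer='adam', loss='mse')
encoder = Model(inputs, encoded)                  # later used to produce features for an SVM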

RNN and LSTM in TensorFlow

青春壹個敷衍的年華 submitted on 2019-12-25 00:33:48
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
"""
RNN in TensorFlow: classifying the handwritten-digit dataset
with a recurrent neural network.
"""
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Fix the random seed so the two computations can be compared.
tf.set_random_seed(1)

# Load the handwritten-digit dataset.
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

# Hyperparameters.
lr = 0.001
training_iters = 100000
batch_size = 128
n_inputs = 28         # MNIST images are 28*28; one row of pixels per step
n_steps = 28          # number of time steps, one per image row
n_hidden_units = 128  # number of neurons in the hidden layer
n_classes = 10        # MNIST has 10 classes

# Define the weights.
weights = {
    # (28, 128)
    'in': tf.Variable(tf.random_normal([n_inputs, n_hidden_units]))
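
The excerpt cuts off inside the weights dictionary. For completeness, a hedged sketch of how this style of TF1 tutorial typically continues; the variable names and the use of BasicLSTMCell with dynamic_rnn here are assumptions, not the rest of the original post:

# Sketch of a typical continuation (assumed, not the original post's code).
weights['out'] = tf.Variable(tf.random_normal([n_hidden_units, n_classes]))
biases = {
    'in': tf.Variable(tf.constant(0.1, shape=[n_hidden_units])),
    'out': tf.Variable(tf.constant(0.1, shape=[n_classes])),
}

def RNN(X, weights, biases):
    # (batch, 28 steps, 28 inputs) -> (batch * 28, 28) for the input projection.
    X = tf.reshape(X, [-1, n_inputs])
    X_in = tf.matmul(X, weights['in']) + biases['in']
    X_in = tf.reshape(X_in, [-1, n_steps, n_hidden_units])

    cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden_units)
    init_state = cell.zero_state(batch_size, dtype=tf.float32)
    outputs, final_state = tf.nn.dynamic_rnn(cell, X_in,
                                             initial_state=init_state,
                                             time_major=False)
    # final_state.h is the hidden state at the last time step.
    return tf.matmul(final_state.h, weights['out']) + biases['out']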

Keras LSTM multiple errors from trying to create model architecture

点点圈 submitted on 2019-12-24 21:06:40
Question: This is a duplicate of a question I posted earlier today; in the other question I was using an old version of Keras. I've upgraded to Keras 2.0.0 and was still getting a lot of errors that I can't figure out on my own, so I'm reposting the question mostly verbatim. I am trying to understand how to use Keras for supply-chain forecasting, and I keep getting errors that I can't find help for elsewhere. I've tried to follow similar tutorials: a sunspot forecasting tutorial, a pollution multivariate
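
As a point of reference, a minimal Keras 2-style LSTM regression model for this kind of time-series forecasting; the window length, feature count and layer sizes are placeholders, not the asker's architecture:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

timesteps, features = 12, 5          # e.g. 12 past periods, 5 input variables
model = Sequential([
    LSTM(32, input_shape=(timesteps, features)),
    Dense(1),                        # predict the next period's demand
])
model.compile(optimizer='adam', loss='mse')

# Dummy data just to show the expected shapes: (samples, timesteps, features).
X = np.random.rand(100, timesteps, features).astype('float32')
y = np.random.rand(100, 1).astype('float32')
model.fit(X, y, epochs=2, batch_size=16, verbose=0)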