rnn | 易学教程

Principles of Recurrent Neural Networks

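An introduction to recurrent neural networks (RNNs). Recurrent neural networks are a kind of neural network specialized for processing sequences. They are often used for natural language processing (NLP) tasks because they are very effective at handling text. In this article we will explore what RNNs are, understand how they work, and build a real RNN from scratch in Python (using only numpy). This article assumes basic knowledge of neural networks. My introduction to neural networks covers everything you need to know, so I recommend reading it first. For a more complete version of this article, see the deep learning column on 极客教程: http://geek-docs.com/deep-learning/rnn/rnn-introduction.html Let's get started!

The Why

One problem with ordinary neural networks (and CNNs) is that they only work with predetermined sizes: they accept fixed-size inputs and produce fixed-size outputs. RNNs are useful because they let us use variable-length sequences as both inputs and outputs. Here are some examples of RNNs. (Figure: inputs are red, the RNN itself is green, outputs are blue.)

This ability to process sequences makes RNNs very useful. For example: Machine translation (e.g. Google Translate) is done with "many-to-many" RNNs. The original text sequence is fed into an RNN, which then produces the translated text as output. Sentiment analysis (e.g. is this a positive or a negative review?) is often done with "many-to-one" RNNs. The text to be analyzed is fed into an RNN, which then produces a single output classification (e.g. this is a positive review).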
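The "many-to-one" case can be illustrated with a minimal NumPy sketch of a vanilla RNN forward pass. This is not the article's own implementation: the function name `rnn_forward`, the weight names (`Wxh`, `Whh`, `Why`, `bh`, `by`), the layer sizes, and the random inputs are all assumptions chosen for illustration. The point is only that the same weights handle sequences of different lengths and still yield a fixed-size output.

```python
import numpy as np

def rnn_forward(inputs, Wxh, Whh, Why, bh, by):
    """Run a vanilla RNN over a sequence; return the final output and hidden state."""
    h = np.zeros((Whh.shape[0], 1))          # initial hidden state
    for x in inputs:                         # each x has shape (input_size, 1)
        h = np.tanh(Wxh @ x + Whh @ h + bh)  # same weights reused at every step
    y = Why @ h + by                         # one output for the whole sequence
    return y, h

# Hypothetical sizes and randomly initialized weights (illustration only).
rng = np.random.default_rng(0)
input_size, hidden_size, output_size = 4, 8, 2
Wxh = rng.normal(0, 0.1, (hidden_size, input_size))
Whh = rng.normal(0, 0.1, (hidden_size, hidden_size))
Why = rng.normal(0, 0.1, (output_size, hidden_size))
bh = np.zeros((hidden_size, 1))
by = np.zeros((output_size, 1))

# Two sequences of different lengths go through the same network.
short_seq = [rng.normal(size=(input_size, 1)) for _ in range(3)]
long_seq = [rng.normal(size=(input_size, 1)) for _ in range(10)]
y_short, h_short = rnn_forward(short_seq, Wxh, Whh, Why, bh, by)
y_long, h_long = rnn_forward(long_seq, Wxh, Whh, Why, bh, by)
print(y_short.shape, y_long.shape)
```

Both outputs have the same shape even though the inputs differ in length, which is the property that fixed-size feed-forward networks lack.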

A Guide to Recurrent Neural Networks

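A recurrent neural network (RNN) is a class of recursive neural network that takes sequence data as input, recurses along the direction in which the sequence evolves, and connects all of its nodes (recurrent units) in a chain. Bidirectional RNNs (Bi-RNN) and Long Short-Term Memory networks (LSTM) are common recurrent neural networks. Today we introduce several common recurrent neural network models. The content comes mainly from Colah's blog, plus some of my own summaries.

Recurrent neural networks (RNNs)

When reading or thinking, humans often combine earlier information to reach a result. An ordinary neural network cannot do this, which is perhaps the main shortcoming of traditional neural networks. A recurrent neural network can: through its loop, it passes the information learned at the current step on to the next step. The figure above is an example of a recurrent neural network. Like an ordinary neural network, it has input and output layers as well as a hidden layer. However, a recurrent neural network combines the input at the current step with the hidden state produced by the network at the previous step to compute the hidden state of the current time step, and passes that hidden state as input to the network at the next time step. Meanwhile, the rest of the network computes the output of the current step from the current state. The original post gives the formulas for this computation at this point. To make this easier to understand, we can view a recurrent neural network as a series of networks that share weights, and unroll it. After unrolling, the recurrent neural network appears to be naturally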
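For reference, one common way to write the vanilla RNN update is shown here. The symbols are assumptions and are not taken from the original post: \(x_t\) is the input at step \(t\), \(h_t\) the hidden state, \(y_t\) the output, and \(W_{xh}, W_{hh}, W_{hy}, b_h, b_y\) the weights and biases shared across all time steps.

\[
h_t = \tanh\left(W_{xh}\, x_t + W_{hh}\, h_{t-1} + b_h\right), \qquad y_t = W_{hy}\, h_t + b_y
\]

Unrolling the network simply means applying this same pair of equations for \(t = 1, 2, \dots, T\), with \(h_0\) usually set to zero.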

[CS224n] Lecture 8 Notes

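Note: this is Lecture 8 of the 2017 course. I have been using RNNs all along without really understanding their internals, so I spent an afternoon and an evening going through the RNN derivations in CS224n. I would not claim to have mastered them, but things are much clearer than before. These notes are for later reference.

Overview

The lecture covers: traditional language models; RNNs and RNN language models; some problems (vanishing and exploding gradients) and training tricks; other applications of RNNs; bidirectional and multi-layer RNNs.

Traditional language models

Language model: first, the concept. In short, a language model describes the probability of a sequence of words; in the original wording, "a language model computes a probability for a sentence of words." The benefit is that it can capture word order and better word choice. For word order, the example given in the lecture is \(P(\text{the cat is small}) > P(\text{small is the cat})\), i.e. the correct word order gets a higher probability under the language model. For word choice, the example is \(P(\text{walking home after school}) > P(\text{walking house after school})\). In other words, the higher the probability, the more natural the sentence sounds.

Definition of a traditional language model: the probability computed by a traditional language model is mostly based on a "window" size, i.e. the \(n\) in \(n\)-gram. A natural view is that the probability of the current word should also depend on all previously chosen words, that is: \[P({w_t}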
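The window idea can be sketched with a tiny bigram model (\(n = 2\)) in Python. This is an illustrative sketch, not code from the lecture: the three-sentence corpus, the function names `train_bigram` and `sentence_prob`, the `<s>` start token, and the add-one smoothing are all assumptions made so the example runs on its own.

```python
from collections import Counter

def train_bigram(corpus):
    """Count bigrams and their left contexts over whitespace-tokenized sentences."""
    contexts, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split()   # "<s>" marks the sentence start
        contexts.update(tokens[:-1])          # every token that precedes another
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams

def sentence_prob(sentence, contexts, bigrams, vocab_size, alpha=1.0):
    """P(sentence) as a product of add-alpha smoothed bigram probabilities."""
    tokens = ["<s>"] + sentence.split()
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        prob *= (bigrams[(prev, cur)] + alpha) / (contexts[prev] + alpha * vocab_size)
    return prob

# Toy corpus (made up for illustration).
corpus = ["the cat is small", "the dog is small", "the cat is black"]
vocab_size = len({w for s in corpus for w in s.split()})
contexts, bigrams = train_bigram(corpus)

p_good = sentence_prob("the cat is small", contexts, bigrams, vocab_size)
p_bad = sentence_prob("small is the cat", contexts, bigrams, vocab_size)
print(p_good > p_bad)
```

On this toy corpus the sentence in natural word order scores higher than the scrambled one, mirroring the lecture's word-order example. Each word's probability here depends only on the one word before it, which is exactly the fixed-window limitation of n-gram models.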

LSTM and Bidirectional LSTM: Explanation and Practice

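Contents: the long-term dependency problem of RNNs; how LSTM works; how bidirectional LSTM works; implementing LSTM and bidirectional LSTM in Keras.

1. The long-term dependency problem of RNNs

The recurrent neural network (RNN) introduced in the previous article suffers from the long-term dependency problem during training, because an RNN runs into vanishing gradients (most of the time) or exploding gradients (rarely, but with a large effect on optimization). Exploding gradients are easy to handle with gradient clipping: when the norm of the gradient vector exceeds a threshold, the gradient vector is scaled down. Vanishing gradients are much harder to solve. Vanishing or exploding gradients refer to the tendency of the gradient, as it is computed and backpropagated during training, to shrink or grow at every time step, so that after enough steps it converges to zero (vanishes) or diverges toward infinity (explodes). Put simply, the long-term dependency problem is that as the gap between time steps keeps growing, an RNN loses the ability to connect to distant information. As shown in the figure below, as t increases and the gap between time t and time 0 becomes large, the memory ht at time t may no longer be able to learn a connection to the information from time 0. Suppose the input X0 is "I live in Shenzhen", many other sentences follow, and then the input Xt is "I work at the city government". Because X0 and Xt are far apart, by the time the RNN reaches Xt, the memory ht has already lost the information stored at X0. So at Xt the network cannot tell which city's government I work at.

2. How LSTM works

In theory, an RNN is absolutely capable of handling such long-term dependencies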
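Gradient clipping, as described in section 1, can be sketched in NumPy as follows. This is one common variant (clipping by the global norm across all gradient arrays), shown under assumptions: the function name `clip_by_global_norm`, the threshold of 5.0, and the example gradient values are hypothetical and not taken from the article.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient arrays so their combined L2 norm is at most max_norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm        # shrink every gradient by the same factor
        grads = [g * scale for g in grads]   # direction is preserved, only length changes
    return grads, total_norm

# Made-up gradients whose combined norm is sqrt(9 + 16 + 144) = 13.
grads = [np.array([3.0, 4.0]), np.array([[12.0]])]
clipped, norm_before = clip_by_global_norm(grads, max_norm=5.0)
norm_after = np.sqrt(sum(np.sum(g ** 2) for g in clipped))
print(norm_before, norm_after)
```

Because every array is multiplied by the same factor, the update direction is unchanged. Clipping only bounds the step size, which is why it helps with exploding gradients but does nothing for vanishing ones.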

RNN/LSTM deep learning model?

I am trying to build an RNN/LSTM model for binary classification (0 or 1). Here is a sample of my dataset; the columns are, respectively: patient number, time in milliseconds, normalization of X, Y and Z, kurtosis, skewness, pitch, roll and yaw, label.

1,15,-0.248010047716,0.00378335508419,-0.0152548459993,-86.3738760481,0.872322164158,-3.51314800063,0
1,31,-0.248010047716,0.00378335508419,-0.0152548459993,-86.3738760481,0.872322164158,-3.51314800063,0
1,46,-0.267422664673,0.0051143782875,-0.0191247001961,-85.7662354031,1.0928406847,-4.08015176908,0
1,62,-0.267422664673,0.0051143782875,-0.0191247001961,-85.7662354031,1
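Recurrent layers usually expect input shaped as (samples, time steps, features), so rows like these typically have to be grouped into per-patient windows first. The sketch below is one possible way to do that with NumPy, not an answer from the original thread. The helper name `make_windows`, the window length of 2, and the choice to label each window with its last row's label are assumptions. It uses only the three complete sample rows, treating the six middle columns as features.

```python
import numpy as np

def make_windows(rows, window, n_features):
    """Build (samples, window, n_features) sequences from rows laid out as
    [patient_id, time, feature_1..feature_n, label]; windows never cross patients."""
    X, y = [], []
    for pid in np.unique(rows[:, 0]):
        patient = rows[rows[:, 0] == pid]
        patient = patient[np.argsort(patient[:, 1])]       # order by time
        feats, labels = patient[:, 2:2 + n_features], patient[:, -1]
        for start in range(len(patient) - window + 1):
            X.append(feats[start:start + window])
            y.append(labels[start + window - 1])           # assumption: last row's label
    return np.array(X), np.array(y)

# The three complete sample rows from the question.
rows = np.array([
    [1, 15, -0.248010047716, 0.00378335508419, -0.0152548459993,
     -86.3738760481, 0.872322164158, -3.51314800063, 0],
    [1, 31, -0.248010047716, 0.00378335508419, -0.0152548459993,
     -86.3738760481, 0.872322164158, -3.51314800063, 0],
    [1, 46, -0.267422664673, 0.0051143782875, -0.0191247001961,
     -85.7662354031, 1.0928406847, -4.08015176908, 0],
])
X, y = make_windows(rows, window=2, n_features=6)
print(X.shape, y.shape)
```

The resulting `X` has the three-dimensional layout that LSTM layers in common frameworks take as input. The window length and labeling rule would need to be chosen to suit the actual data.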

Multivariate LSTM Forecast Loss and evaluation

I have a CNN-RNN model architecture with bidirectional LSTMs for a time series regression problem. My loss does not converge over 50 epochs. Each epoch has 20k samples. The loss keeps bouncing between 0.001 and 0.01.

    batch_size = 1
    epochs = 50
    model.compile(loss='mean_squared_error', optimizer='adam')
    trainingHistory = model.fit(trainX, trainY, epochs=epochs, batch_size=batch_size, shuffle=False)

I tried training the model with incorrectly paired X and Y data, for which the loss stays around 0.5. Is it a reasonable conclusion that my X and Y have a non-linear relationship which can be learned by my model
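One hedged way to put a loss of 0.001 to 0.01 in context is to compare it with the mean squared error of a trivial predictor that always outputs the mean of the training targets. The sketch below is a suggestion rather than part of the original question: the function name `baseline_mse` is hypothetical, and the small arrays merely stand in for the asker's `trainY` and a held-out target set.

```python
import numpy as np

def baseline_mse(train_y, eval_y):
    """MSE obtained by always predicting the mean of the training targets."""
    mean_prediction = np.mean(train_y, axis=0)
    return float(np.mean((eval_y - mean_prediction) ** 2))

# Stand-in targets (made up); replace with the real training and held-out targets.
train_y = np.array([0.0, 2.0, 4.0])
eval_y = np.array([1.0, 3.0])
print(baseline_mse(train_y, eval_y))
```

If the model's loss is well below this baseline on held-out data, it is learning something beyond the target mean. That alone does not show whether the relationship it has learned is linear or non-linear.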

multi-head attention

■ Paper | Attention Is All You Need
■ Link | https://www.paperweekly.site/papers/224
■ Code | https://github.com/Kyubyong/transformer
■ Paper | Weighted Transformer Network for Machine Translation
■ Link | https://www.paperweekly.site/papers/2013
■ Code | https://github.com/JayParks/transformer

Idea: drop the RNN and model sequences using attention alone. The new network architecture is the Transformer, and the attention mechanism it contains is called self-attention. The Transformer is a model that can compute representations of its input and output without relying on an RNN, which is why the authors say that attention is all you need.

Model: it still has two stages, an encoder and a decoder. Both abandon RNNs and are instead built from stacked self-attention and fully-connected layers. The model architecture is as follows:
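The core computation behind multi-head self-attention can be sketched in NumPy as shown here. It is a simplified illustration, not the code from the linked repositories: the function and weight names (`multi_head_self_attention`, `Wq`, `Wk`, `Wv`, `Wo`), the sizes, and the random weights are assumptions, and masking, positional encoding, residual connections, and layer normalization are all omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, num_heads):
    """X: (seq_len, d_model). Returns the output (seq_len, d_model) and the
    attention weights (num_heads, seq_len, seq_len)."""
    seq_len, d_model = X.shape
    d_head = d_model // num_heads

    def split(M):  # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return M.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    Q, K, V = split(X @ Wq), split(X @ Wk), split(X @ Wv)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)   # scaled dot products
    weights = softmax(scores, axis=-1)                     # each row sums to 1
    heads = weights @ V                                    # (num_heads, seq_len, d_head)
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ Wo, weights

# Hypothetical sizes and random weights (illustration only).
rng = np.random.default_rng(0)
seq_len, d_model, num_heads = 5, 8, 2
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv, Wo = (rng.normal(0, 0.1, (d_model, d_model)) for _ in range(4))
out, attn = multi_head_self_attention(X, Wq, Wk, Wv, Wo, num_heads)
print(out.shape, attn.shape)
```

Every position attends to every other position in a single matrix product, with no step-by-step recurrence. That is what lets this design replace the RNN in the encoder and decoder.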

Error in tensorflow 1.2: AttributeError: module 'tensorflow.contrib.rnn' has no attribute 'BasicLSTMCell'

(1) AttributeError: module 'tensorflow.contrib.rnn' has no attribute 'BasicLSTMCell'. The cause is that BasicLSTMCell is not found there; in tensorflow 1.2.1 the function was moved. Changing the call to tf.nn.rnn_cell.BasicLSTMCell(num_hidden, forget_bias=1.0) fixes it.

(2) AttributeError: module 'tensorflow.python.ops.nn' has no attribute 'static_rnn'. The cause is that static_rnn is not found there; in tensorflow 1.2.1 the function was moved. Changing the call to outputs, states = tf.nn.rnn(lstm_cell, x, dtype=tf.float32) fixes it. Alternatively, replace it with tf.nn.dynamic_rnn, but pay attention to the input it expects.

Source: CSDN. Author: 大漠帝国. Link: https://blog.csdn.net/rensihui/article/details/80063183

AttributeError: module 'tensorflow.python.ops.rnn' has no attribute 'rnn'

Code written for the original TensorFlow version fails with: AttributeError: module 'tensorflow.python.ops.rnn' has no attribute 'rnn'

    from tensorflow.python.ops import rnn, rnn_cell
    lstm_cell = rnn_cell.BasicLSTMCell(rnn_size, state_is_tuple=True)
    outputs, states = rnn.rnn(lstm_cell, x, dtype=tf.float32)

It should be replaced with:

    from tensorflow.contrib import rnn
    lstm_cell = rnn.BasicLSTMCell(rnn_size)
    outputs, states = rnn.static_rnn(lstm_cell, x, dtype=tf.float32)

Source: CSDN. Author: 黄鑫huangxin. Link: https://blog.csdn.net/qq_33373858/article/details/83097027

How to use tensorflow's Dataset API Iterator as an input of a (recurrent) neural network?

Question: When using tensorflow's Dataset API Iterator, my goal is to define an RNN that operates on the iterator's get_next() tensors as its input (see (1) in the code). However, simply defining the dynamic_rnn with get_next() as its input results in an error:

ValueError: Initializer for variable rnn/basic_lstm_cell/kernel/ is from inside a control-flow construct, such as a loop or conditional. When creating a variable inside a loop or conditional, use a lambda as the initializer.

Now I know that
