recurrent-neural-network

What's the difference between a bidirectional LSTM and an LSTM?

社会主义新天地 submitted on 2019-12-02 17:17:11
Can someone please explain this? I know bidirectional LSTMs have a forward and a backward pass, but what is the advantage of this over a unidirectional LSTM? What is each of them better suited for? An LSTM, at its core, preserves information from inputs that have already passed through it using the hidden state. A unidirectional LSTM only preserves information from the past, because the only inputs it has seen are from the past. Using a bidirectional LSTM runs your inputs in two directions, one from past to future and one from future to past, and what distinguishes this approach from the unidirectional one is that in the LSTM …
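The forward/backward idea can be sketched with a toy single-unit recurrence (plain Python, not a real LSTM; the weights w and u are made up for illustration):

```python
import math

def rnn_pass(xs, w=0.5, u=0.3):
    """Toy single-unit recurrence: h_t = tanh(w*x_t + u*h_{t-1})."""
    h, out = 0.0, []
    for x in xs:
        h = math.tanh(w * x + u * h)
        out.append(h)
    return out

def bidirectional(xs):
    """Concatenate forward states with backward states (re-reversed so
    index t lines up with input t), the way a bidirectional wrapper does."""
    fwd = rnn_pass(xs)
    bwd = list(reversed(rnn_pass(list(reversed(xs)))))
    return list(zip(fwd, bwd))

seq = [1.0, 2.0, 3.0]
states = bidirectional(seq)
# At t=0 the forward state has only seen the first input, while the
# backward state at t=0 already reflects the entire rest of the sequence --
# that access to future context is the advantage of the bidirectional variant.
```

The cost is that a bidirectional model needs the whole sequence up front, so it suits offline tasks (tagging, classification) rather than streaming prediction.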

Soft attention vs. hard attention

纵饮孤独 submitted on 2019-12-02 15:16:50
In this blog post, The Unreasonable Effectiveness of Recurrent Neural Networks, Andrej Karpathy mentions future directions for neural-network-based machine learning: "The concept of attention is the most interesting recent architectural innovation in neural networks. [...] soft attention scheme for memory addressing is convenient because it keeps the model fully-differentiable, but unfortunately one sacrifices efficiency because everything that can be attended to is attended to (but softly)." Think of this as declaring a pointer in C that doesn't point to a specific address but instead defines …
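The trade-off Karpathy describes can be sketched numerically (a minimal illustration with made-up scores and memory values, not any particular model):

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

memory = [10.0, 20.0, 30.0]      # values that can be attended to
scores = [1.0, 2.0, 3.0]         # relevance of each slot to the query

# Soft attention: EVERY slot contributes, weighted by softmax -- fully
# differentiable, but the read cost grows with memory size.
weights = softmax(scores)
soft_read = sum(w * v for w, v in zip(weights, memory))

# Hard attention: commit to a single slot -- a cheap read, but the discrete
# choice has no gradient and needs e.g. REINFORCE-style training.
hard_read = memory[scores.index(max(scores))]
```

Soft attention is the "pointer that touches everything softly"; hard attention behaves like a normal C pointer but breaks end-to-end backpropagation.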

What's the difference between convolutional and recurrent neural networks?

强颜欢笑 submitted on 2019-12-02 15:04:14
I'm new to the topic of neural networks. I came across the two terms convolutional neural network and recurrent neural network. I'm wondering if these two terms refer to the same thing, or, if not, what the difference between them is? The differences between a CNN and an RNN are as follows: CNN: A CNN takes fixed-size inputs and generates fixed-size outputs. A CNN is a type of feed-forward artificial neural network - a variation of the multilayer perceptron designed to use minimal amounts of preprocessing. CNNs use a connectivity pattern between their neurons that is inspired by the …
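The fixed-size-versus-sequence distinction can be illustrated without any framework (toy operations standing in for real layers):

```python
def conv1d_valid(xs, kernel):
    """'Valid' 1-D convolution: output length is len(xs) - len(kernel) + 1,
    so a fixed-size input yields a fixed-size output."""
    k = len(kernel)
    return [sum(xs[i + j] * kernel[j] for j in range(k))
            for i in range(len(xs) - k + 1)]

def rnn_accumulate(xs):
    """Trivial recurrence h_t = h_{t-1} + x_t: one set of weights is applied
    step by step, so a single pass handles input of ANY length."""
    h = 0.0
    for x in xs:
        h = h + x
    return h

out = conv1d_valid([1, 2, 3, 4], kernel=[1, 1])     # fixed-length output
short, long_ = rnn_accumulate([1, 2]), rnn_accumulate([1, 2, 3, 4, 5])
```

The convolution's output shape is determined by its input shape, which is why CNNs suit images; the recurrence carries state across steps, which is why RNNs suit sequences such as text or audio.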

How to use Bidirectional RNN and Conv1D in keras when shapes are not matching?

放肆的年华 submitted on 2019-12-02 08:22:05
I am brand new to deep learning, so I'm reading through Deep Learning with Keras by Antonio Gulli and learning a lot. I want to start using some of the concepts. I want to try to implement a neural network with a 1-dimensional convolutional layer that feeds into a bidirectional recurrent layer (like the paper below). All the tutorials or code snippets I've encountered either do not implement anything remotely similar to this (e.g. image recognition) or use an older version of Keras with different functions and usage. What I'm trying to do is a variation of this paper: (1) convert DNA sequences to one …
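One way to reason about shape mismatches in such a stack is to track how each layer transforms the timestep axis. The sketch below assumes Keras-style "valid" convolution arithmetic; the sizes (1000 timesteps, kernel_size=26, pool size 13, 64 LSTM units) are illustrative, not taken from the book or the paper:

```python
def conv1d_out_len(steps, kernel_size, padding="valid", strides=1):
    """Output timesteps of a Keras-style Conv1D layer (shape arithmetic
    only, not a real layer)."""
    if padding == "same":
        return -(-steps // strides)                  # ceil division
    return (steps - kernel_size) // strides + 1

# e.g. a one-hot DNA input of shape (timesteps=1000, channels=4)
steps = 1000
steps = conv1d_out_len(steps, kernel_size=26)        # Conv1D: 1000 -> 975
steps = steps // 13                                  # MaxPooling1D(13): -> 75
# A Bidirectional LSTM then consumes however many timesteps the conv stack
# emits and doubles the feature axis, yielding (75, 2 * units) here:
bidir_shape = (steps, 2 * 64)
```

The practical takeaway: the recurrent layer adapts to whatever timestep count the convolution produces, so mismatch errors usually mean a hand-specified `input_shape` or a `Dense`/reshape downstream disagrees with the computed shape, not that Conv1D and Bidirectional are incompatible.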

Classifying sequences of different lengths [duplicate]

隐身守侯 submitted on 2019-12-02 01:14:33
Question: This question already has answers here: How do I create a variable-length input LSTM in Keras? (3 answers). Closed 2 years ago. Despite going through multiple examples, I still don't understand how to classify sequences of varying length using Keras, similar to this question. I can train a network that detects frequencies of a sinusoid with varying length by using masking: from keras import models from keras.layers.recurrent import LSTM from keras.layers import Dense, Masking from keras …
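The idea behind masking can be sketched in plain Python: pad every sequence to a common length, then make the "network" ignore the padded steps. This is a toy stand-in for Keras' `Masking` layer, with 0.0 as an assumed mask value:

```python
MASK = 0.0

def pad(seqs, value=MASK):
    """Right-pad every sequence to the length of the longest one, so the
    batch becomes a rectangular array the framework can accept."""
    n = max(len(s) for s in seqs)
    return [list(s) + [value] * (n - len(s)) for s in seqs]

def masked_mean(seq, mask_value=MASK):
    """Aggregate only over unmasked timesteps, the way a Masking layer
    makes a downstream RNN skip padded steps."""
    real = [x for x in seq if x != mask_value]
    return sum(real) / len(real)

batch = pad([[1.0, 2.0], [3.0, 4.0, 5.0, 6.0]])
means = [masked_mean(s) for s in batch]
```

In Keras the same pattern is `Masking(mask_value=0.)` placed before the LSTM, with sequences padded via `pad_sequences`; the mask value must then never occur in real data.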

MultiRNN and static_rnn error: Dimensions must be equal, but are 256 and 129

≯℡__Kan透↙ submitted on 2019-12-02 00:18:18
Question: I want to build an LSTM network with 3 layers. Here's the code:
num_layers=3
time_steps=10
num_units=128
n_input=1
learning_rate=0.001
n_classes=1
...
x=tf.placeholder("float",[None,time_steps,n_input],name="x")
y=tf.placeholder("float",[None,n_classes],name="y")
input=tf.unstack(x,time_steps,1)
lstm_layer=rnn_cell.BasicLSTMCell(num_units,state_is_tuple=True)
network=rnn_cell.MultiRNNCell([lstm_layer for _ in range(num_layers)],state_is_tuple=True)
outputs,_=rnn.static_rnn(network,inputs …
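The likely culprit in the snippet above is that the list comprehension reuses one cell object for every layer: the first layer's kernel is built for an input width of n_input + num_units = 1 + 128 = 129, and sharing that same cell with layer 2, whose input is 128 + 128 = 256 wide, produces exactly the "256 and 129" mismatch. A framework-free sketch of the identity bug (the `Cell` class here is a made-up stand-in for `BasicLSTMCell`):

```python
class Cell:
    """Stand-in for an RNN cell whose variables are created once, on first use."""
    def __init__(self, num_units):
        self.num_units = num_units

num_units, num_layers = 128, 3

# Buggy pattern from the question: every list entry is the SAME object,
# so all layers try to share one set of weights despite different input widths.
cell = Cell(num_units)
shared = [cell for _ in range(num_layers)]

# Fix: construct a FRESH cell per layer, i.e. in the original code
#   MultiRNNCell([rnn_cell.BasicLSTMCell(num_units, state_is_tuple=True)
#                 for _ in range(num_layers)], state_is_tuple=True)
fresh = [Cell(num_units) for _ in range(num_layers)]
```

Moving the cell constructor inside the list comprehension is the standard fix for this TF1 error.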

How to use Tensorflow's PTB model example?

百般思念 submitted on 2019-12-01 17:55:27
I'm trying out Tensorflow's rnn example. After some problems at the start I could run the example and train the PTB model, and now I have a trained model. How exactly do I use the model now to create sentences, without having to train it again every time? I'm running it with a command like python ptb_word_lm.py --data_path=/home/data/ --model medium --save_path=/home/medium Is there an example somewhere on how to use the trained model to make sentences? 1. Add the following code at the last line of the PTBModel.__init__() function: self._output_probs = tf.nn.softmax(logits) 2. Add the following function …
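Step 1's `self._output_probs` gives a probability distribution over the vocabulary at each step; generating a sentence then amounts to repeatedly sampling a word id from that distribution and feeding it back in. A framework-free sketch of the sampling step (the logits are made up for illustration):

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def sample(probs, rng):
    """Draw a word id by inverse-CDF sampling from the distribution that
    self._output_probs = tf.nn.softmax(logits) would produce per step."""
    r, acc = rng.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1          # guard against floating-point round-off

rng = random.Random(0)
probs = softmax([2.0, 1.0, 0.1])   # toy 3-word vocabulary
word_id = sample(probs, rng)
```

Always sampling the argmax instead tends to produce repetitive text, which is why generation scripts usually sample from the full distribution.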

Tensorflow: What are the “output_node_names” for freeze_graph.py in the model_with_buckets model?

独自空忆成欢 submitted on 2019-12-01 14:50:59
Question: I trained a tf.nn.seq2seq.model_with_buckets with seq2seq = tf.nn.seq2seq.embedding_attention_seq2seq, very similar to the example in the Tensorflow tutorial. Now I would like to freeze the graph using freeze_graph.py. How can I find the "output_node_names" in my model? Answer 1: You can choose names for the nodes in your model by passing the optional name="myname" argument to pretty much any Tensorflow operator that builds a node. Tensorflow will pick names for graph nodes automatically if you …