recurrent-neural-network

Recurrent convolutional BLSTM neural network - arbitrary sequence lengths

Question: Using Keras + Theano, I successfully built a recurrent bidirectional-LSTM neural network that can train on and classify DNA sequences of arbitrary length, using the following model (for fully working code see: http://pastebin.com/jBLv8B72):

    sequence = Input(shape=(None, ONE_HOT_DIMENSION), dtype='float32')
    dropout = Dropout(0.2)(sequence)
    # bidirectional LSTM
    forward_lstm = LSTM(
        output_dim=50, init='uniform', inner_init='uniform',
        forget_bias_init='one', return_sequences=True,

Why does RNN always output 1

Question: I am using a recurrent neural network (RNN) for forecasting, but for some reason it always outputs 1. Here is a toy example.

Example: Consider a matrix M of dimensions (360, 5) and a vector Y that contains the row sums of M. Using an RNN, I want to predict Y from M. Using the rnn R package, I trained the model as follows:

    library(rnn)
    M <- matrix(c(1:1800), ncol=5, byrow=TRUE)  # Matrix (say, features)
    Y <- apply(M, 1, sum)                       # Output equals the row sums of M
    mt <- array(c(M),dim=c(NROW(M),1
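One plausible cause of a constant output of 1 can be sketched in a few lines. This is an assumption, since the excerpt does not show the model's output layer: if the output unit is a sigmoid, predictions are confined to (0, 1), so targets in the thousands push the unit into saturation. The helper names `sigmoid` and `min_max_scale` below are hypothetical, and the sketch is in Python rather than R.

```python
import math

def sigmoid(z):
    """Logistic function; its output always lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Rebuild the toy data from the question: a 360 x 5 matrix filled
# row-wise with 1..1800, and Y as the row sums.
rows = [list(range(5 * i + 1, 5 * i + 6)) for i in range(360)]
y = [sum(row) for row in rows]

# The raw targets are far outside the range a sigmoid can produce.
print(min(y), max(y))

# A sigmoid unit saturates for large pre-activations.
print(sigmoid(2.0), sigmoid(50.0))

# Scaling the targets to [0, 1] puts them inside the reachable range.
y_scaled = min_max_scale(y)
print(y_scaled[0], y_scaled[-1])
```

If this assumption holds, the predictions have to be mapped back to the original scale afterwards by inverting the same linear transform.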

Seq2Seq model learns to only output EOS token (<\s>) after a few iterations

Question: I am creating a chatbot trained on the Cornell Movie Dialogs Corpus using NMT. My code is based in part on https://github.com/bshao001/ChatLearner and https://github.com/chiphuyen/stanford-tensorflow-tutorials/tree/master/assignments/chatbot. During training, to observe the learning progress, I print a random answer from the batch that is fed to the decoder, along with the corresponding answer that my model predicts. My issue: After only about 4 iterations of training, the model learns to output the

Keras simple RNN implementation

Question: I ran into problems when trying to compile a network with one recurrent layer. There seems to be an issue with the dimensionality of the first layer, and thus with my understanding of how RNN layers work in Keras. My code sample is:

    model.add(Dense(8, input_dim=2, activation="tanh", use_bias=False))
    model.add(SimpleRNN(2, activation="tanh", use_bias=False))
    model.add(Dense(1, activation="tanh", use_bias=False))

The error is:

    ValueError: Input 0 is incompatible with layer simple_rnn_1:

How to plot a learning curve for a keras experiment?

Question: I'm training an RNN using Keras and would like to see how the validation accuracy changes with the data set size. Keras has a list called val_acc in its history object, to which the validation set accuracy is appended after every epoch (link to the post in the Google group). I want to take the average of val_acc over the epochs run and plot it against the respective data set size. Question: How can I retrieve the elements in the val_acc list and perform an operation like
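A minimal sketch of the averaging step described above, assuming the per-epoch values are available as a plain dictionary of lists, which is the form of the `history` attribute on the object Keras returns from `fit`. The dictionaries and accuracy values below are made-up placeholders, not real results, and the helper `mean_val_acc` is hypothetical. Depending on the Keras version, the key may be `val_accuracy` rather than `val_acc`.

```python
from statistics import mean

def mean_val_acc(history_dict, key="val_acc"):
    """Average the per-epoch validation accuracy stored under `key`."""
    return mean(history_dict[key])

# Placeholder histories keyed by training set size. In a real experiment
# each value would be `model.fit(...).history` for a run on that many samples.
runs = {
    100: {"val_acc": [0.50, 0.60, 0.70]},
    200: {"val_acc": [0.60, 0.70, 0.80]},
    400: {"val_acc": [0.70, 0.80, 0.90]},
}

# One (data set size, mean validation accuracy) pair per run, sorted by size.
learning_curve = sorted((size, mean_val_acc(h)) for size, h in runs.items())

for size, acc in learning_curve:
    print(size, round(acc, 3))
```

The resulting pairs can be unzipped into x and y lists and passed to any plotting library to draw the learning curve.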

How to generate a sentence from feature vector or words?

I used the VGG 16-layer Caffe model for image captions, and I have several captions per image. Now I want to generate a sentence from those captions (words). I read in a paper on LSTM that I should remove the SoftMax layer from the training network and feed the 4096-dimensional feature vector from the fc7 layer directly to the LSTM. I am new to LSTMs and RNNs. Where should I begin? Is there any tutorial showing how to generate a sentence by sequence labeling?

AFAIK the master branch of BVLC/caffe does not yet support a recurrent layer architecture. You should pull the branch recurrent from jeffdonahue/caffe. This

Layer called with an input that isn't a symbolic tensor keras

Question: I'm trying to pass the output of one layer into two different layers and then join them back together. However, I'm being stopped by this error, which tells me that my input isn't a symbolic tensor: Received type: <class 'keras.layers.recurrent.LSTM'>. All inputs to the layers should be tensors. However, I believe I'm following the documentation quite closely: https://keras.io/getting-started/functional-api-guide/#multi-input-and-multi-output-models and am not entirely sure why this is

implementing RNN with numpy

I'm trying to implement a recurrent neural network with numpy. My current input and output designs are as follows:

x: (sequence length, batch size, input dimension)
h: (number of layers, number of directions, batch size, hidden size)
initial weight: (number of directions, 2 * hidden size, input size + hidden size)
weight: (number of layers - 1, number of directions, hidden size, directions * hidden size + hidden size)
bias: (number of layers, number of directions, hidden size)

I have looked up the PyTorch API of RNN as a reference ( https://pytorch.org/docs/stable/nn.html?highlight
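For orientation, here is a minimal sketch of the forward pass for the simplest case only: one layer, one direction, tanh activation. It is not the multi-layer, bidirectional design listed above. It assumes separate input-to-hidden and hidden-to-hidden weight matrices (the layout PyTorch uses for `weight_ih_l0` and `weight_hh_l0`) rather than the combined weight in the question, and the function name `rnn_forward` is hypothetical.

```python
import numpy as np

def rnn_forward(x, h0, w_ih, w_hh, b):
    """Run a single-layer, unidirectional tanh RNN over a sequence.

    x:    (seq_len, batch, input_dim)
    h0:   (batch, hidden)
    w_ih: (hidden, input_dim)
    w_hh: (hidden, hidden)
    b:    (hidden,)
    Returns the per-step hidden states (seq_len, batch, hidden)
    and the final hidden state (batch, hidden).
    """
    h = h0
    outputs = []
    for x_t in x:  # iterate over the time dimension
        h = np.tanh(x_t @ w_ih.T + h @ w_hh.T + b)
        outputs.append(h)
    return np.stack(outputs), h

# Small random example; the sizes are arbitrary.
rng = np.random.default_rng(0)
seq_len, batch, input_dim, hidden = 4, 3, 5, 6
x = rng.standard_normal((seq_len, batch, input_dim))
h0 = np.zeros((batch, hidden))
w_ih = rng.standard_normal((hidden, input_dim)) * 0.1
w_hh = rng.standard_normal((hidden, hidden)) * 0.1
b = np.zeros(hidden)

out, h_last = rnn_forward(x, h0, w_ih, w_hh, b)
print(out.shape, h_last.shape)
```

Stacking layers or adding a reverse direction amounts to calling the same loop again, either on the previous layer's outputs or on the time-reversed input.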

Mixing feed forward layers and recurrent layers in Tensorflow?

Has anyone been able to mix feedforward layers and recurrent layers in Tensorflow? For example: input->conv->GRU->linear->output. I can imagine one could define one's own cell with feedforward layers and no state, which can then be stacked using the MultiRNNCell function, something like:

    cell = tf.nn.rnn_cell.MultiRNNCell([conv_cell,GRU_cell,linear_cell])

This would make life a whole lot easier...

Can't you just do the following:

    rnnouts, _ = rnn(grucell, inputs)
    linearout = [tf.matmul(rnnout, weights) + bias for rnnout in rnnouts]

etc. This tutorial gives an example of how to use convolutional
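The per-step linear layer in the reply above can be illustrated without TensorFlow. The following numpy sketch uses random arrays as a stand-in for the recurrent layer's outputs (an assumption; no actual GRU is run) and checks that applying the linear map to each time step in a list comprehension matches one batched matrix product over the stacked outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, batch, hidden, out_dim = 4, 2, 3, 5

# Stand-in for the list of per-step outputs of a recurrent layer,
# each of shape (batch, hidden).
rnn_outs = [rng.standard_normal((batch, hidden)) for _ in range(seq_len)]
weights = rng.standard_normal((hidden, out_dim))
bias = rng.standard_normal(out_dim)

# Feedforward layer applied step by step, as in the reply.
linear_outs = [step @ weights + bias for step in rnn_outs]

# Same computation as a single batched product over all time steps.
stacked = np.stack(rnn_outs)        # (seq_len, batch, hidden)
batched = stacked @ weights + bias  # (seq_len, batch, out_dim)

print(batched.shape)
print(np.allclose(np.stack(linear_outs), batched))
```

Because the feedforward layer carries no state, it does not need to be wrapped as a recurrent cell; it only has to be applied to every time step with shared weights.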

Using RNN to recover sine wave from noisy signal

Question: I am working on an application that needs to estimate the state of a certain system in real time by measuring a set of (non-linearly) dependent parameters. Until now the application has used an extended Kalman filter, but it was found to underperform in certain circumstances, likely because the differences between the real system and the model used in the filter are too significant to be modeled as white noise. We cannot use a more precise model for a