lstm

When to stop training neural networks?

Submitted by 自古美人都是妖i on 2019-12-06 08:12:43
Question: I'm trying to carry out domain-specific classification research using an RNN and have accumulated tens of millions of texts. Since running the whole dataset takes days or even months, I picked only a small portion of it for testing, say 1M texts (80% for training, 20% for validation). I pre-trained word vectors on the whole corpus and also applied Dropout to the model to avoid over-fitting. After it had trained on 60,000 texts in 12 hours, the loss had already dropped to a fairly low
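One common way to decide when to stop is patience-based early stopping on the validation loss. The sketch below is not taken from the question; the function name `early_stopping_epoch`, the `patience` and `min_delta` parameters, and the loss values are all hypothetical and only illustrate the rule.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the epoch index at which training would stop, or None.

    Training stops once the validation loss has failed to improve on the
    best value seen so far by more than `min_delta` for `patience`
    consecutive epochs.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss      # new best validation loss
            waited = 0       # reset the patience counter
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None


# Made-up validation losses, one per epoch (illustrative only).
losses = [0.90, 0.60, 0.45, 0.44, 0.46, 0.47, 0.48, 0.50]
stop_epoch = early_stopping_epoch(losses, patience=3)
print(stop_epoch)
```

With a 20% validation split like the one described, the same rule can be applied per epoch or per fixed number of batches; which granularity fits best depends on how long one pass over the data takes.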

Testing the Keras sentiment classification with model.predict

Submitted by 不问归期 on 2019-12-06 08:07:20
I have trained imdb_lstm.py on my PC. Now I want to test the trained network by inputting some text of my own. How do I do it? Thank you! So what you basically need to do is as follows. First, tokenize the sequences: convert the string into words (features). For example, "hello my name is georgio" becomes ["hello", "my", "name", "is", "georgio"]. Next, remove stop words (check Google for what stop words are). This stage is optional; it may lead to faulty results, but I think it is worth a try. Finally, stem your words (features); that reduces the number of features, which leads to a faster run.
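The steps above can be sketched in plain Python. This is a minimal illustration, not the answerer's code: the stop-word set and the suffix-stripping `crude_stem` are toy stand-ins (a real pipeline would use a proper stop-word list and stemmer), and all function names are hypothetical.

```python
import re

# Tiny illustrative stop-word list (assumption, not a standard list).
STOP_WORDS = {"is", "my", "a", "the", "and", "of", "to"}
# Crude stand-in for a real stemmer: strip a few common suffixes.
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word set."""
    return [t for t in tokens if t not in STOP_WORDS]


def crude_stem(token):
    """Strip the first matching suffix, keeping at least 3 characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem."""
    return [crude_stem(t) for t in remove_stop_words(tokenize(text))]


tokens = tokenize("hello my name is georgio")
features = preprocess("hello my name is georgio")
print(tokens)
print(features)
```

Before calling model.predict, the resulting words still have to be mapped to the same integer indices and padded to the same sequence length that were used during training; otherwise the inputs will not match what the network learned.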

Keras - Making two predictions from one neural network

Submitted by ⅰ亾dé卋堺 on 2019-12-06 07:50:32
I'm trying to combine two outputs produced by the same network, which makes predictions on a 4-class task and a 10-class task. I then combine these outputs into a length-14 array, which I use as my end target. While this seems to work, the predictions are always for one class, so the network produces a probability distribution that selects only 1 of the 14 options instead of 2. What I actually need is 2 predictions, one for each task, all produced by the same model.
input = Input(shape=(100, 100), name='input')
lstm = LSTM
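A single softmax over all 14 values can only express one distribution, which matches the symptom described. One way to get two predictions is to normalize the 4-class and 10-class parts separately. The plain-Python sketch below only illustrates that idea; the function names and the example logits are hypothetical, and in a Keras model this would normally correspond to two separate output layers, each with its own loss.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def two_head_predictions(logits, n_first=4, n_second=10):
    """Split a length-14 vector into two independent distributions."""
    if len(logits) != n_first + n_second:
        raise ValueError("unexpected number of logits")
    return softmax(logits[:n_first]), softmax(logits[n_first:])


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Made-up network outputs: 4 values for task one, 10 for task two.
logits = [0.1, 2.0, -1.0, 0.3] + [0.0] * 9 + [3.0]
p4, p10 = two_head_predictions(logits)
pred4, pred10 = argmax(p4), argmax(p10)
print(pred4, pred10)
```

Each head sums to 1 on its own, so the model always yields one class for the 4-class task and one for the 10-class task.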

LSTM Sequence Prediction in Keras just outputs last step in the input

Submitted by 心已入冬 on 2019-12-06 07:24:07
I am currently working with Keras using TensorFlow as the backend. I have an LSTM sequence prediction model, shown below, that I am using to predict one step ahead in a data series (input: 30 steps, each with 4 features; output: predicted step 31).
model = Sequential()
model.add(LSTM( input_dim=4, output_dim=75, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM( 150, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense( output_dim=4))
model.add(Activation("linear"))
model.compile(loss="mse", optimizer="rmsprop")
return model
The issue I'm having is that after training the
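The data layout described (30 input steps of 4 features, predicting step 31) can be sketched as a sliding window over the series. This is a minimal pure-Python illustration under that assumption; `make_windows` and the synthetic series are hypothetical and not part of the original post.

```python
def make_windows(series, n_steps=30):
    """Build (X, y) pairs from a series of feature rows.

    X[i] holds rows i .. i + n_steps - 1 and y[i] is the row that
    immediately follows, i.e. the one-step-ahead target.
    """
    X, y = [], []
    for start in range(len(series) - n_steps):
        X.append(series[start:start + n_steps])
        y.append(series[start + n_steps])
    return X, y


# Synthetic series: 100 time steps, 4 features each (illustrative only).
series = [[float(t), t + 0.1, t + 0.2, t + 0.3] for t in range(100)]
X, y = make_windows(series, n_steps=30)
print(len(X), len(X[0]), len(X[0][0]))
```

Checking that y[i] really is the row after the last row of X[i] is a useful sanity test when a model appears to echo its most recent input.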

A brief introduction to RNN, LSTM, and GRU

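Submitted by 别等时光非礼了梦想. on 2019-12-06 06:58:37
I. RNN
1. Why RNNs are needed
An RNN (recurrent neural network) addresses time-series problems, that is, problems in which the samples are correlated along the time dimension. Unlike with an ordinary neural network, consecutive inputs are related in a way that is hard to state explicitly, so a network with the RNN's specific structure is needed to discover that internal connection.
2. Basic network structure
Below are a basic RNN structure and a structure taken from an official site [figures not preserved]. In the first figure, the left side is the folded form and the right side is the unrolled form, which lists the computation at every time step T=t. All steps share the parameters W, U, and V, and the final loss is the sum of the losses at every time step.
3. Forward propagation
At time t we have: [equation not preserved], where ϕ is the activation function, usually tanh, and b is the bias. The output O at time t is: [equation not preserved]. The predicted value y_hat is: [equation not preserved], where σ is the activation function; since RNNs are usually used for classification, softmax is the typical choice. Carrying out this computation at every time step gives y_hat and h_t for each step together with the per-step losses; summing those losses gives the final loss, which completes one forward pass.
4. Backpropagation
Training a model means feeding it data and finding the set of parameters that minimizes the loss function. The most important part is therefore how those parameters are updated. We use gradient descent to update them, which comes down to a single formula (pay attention!): W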
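The equations for the forward pass did not survive in the excerpt above. For orientation, the conventional formulation that matches the symbols it names (W, U, V, ϕ, b, σ) is sketched below. The assignment of U to the input and W to the recurrent connection, and the output bias c, are assumptions based on the common textbook convention, not something the excerpt confirms.

```latex
\begin{aligned}
h^{(t)} &= \phi\left(U x^{(t)} + W h^{(t-1)} + b\right) \\
o^{(t)} &= V h^{(t)} + c \\
\hat{y}^{(t)} &= \sigma\left(o^{(t)}\right) \\
L &= \sum_{t} L^{(t)}
\end{aligned}
```

Here ϕ is typically tanh and σ is typically softmax, as the text states, and the last line is the summed per-step loss it describes.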

What is the architecture behind the Keras LSTM Layer implementation?

Submitted by 房东的猫 on 2019-12-06 06:34:58
Question: How do the input dimensions get converted to the output dimensions for the LSTM layer in Keras? From reading Colah's blog post, it seems as though the number of "timesteps" (AKA the input_dim or the first value in the input_shape) should equal the number of neurons, which should equal the number of outputs from this LSTM layer (set by the units argument of the LSTM layer). From reading this post, I understand the input shapes. What I am baffled by is how Keras plugs the inputs into
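As a point of reference for the question, in the standard LSTM formulation the output width is set by units alone, and the number of timesteps affects neither the output width nor the parameter count. The sketch below is shape arithmetic in plain Python, not Keras code; both helper names are hypothetical, and the parameter formula assumes a standard LSTM with biases and no peephole connections.

```python
def lstm_output_shape(batch, timesteps, features, units,
                      return_sequences=False):
    """Output shape for an input of shape (batch, timesteps, features)."""
    if return_sequences:
        return (batch, timesteps, units)  # one output vector per step
    return (batch, units)                 # only the last step's output


def lstm_param_count(features, units):
    """Trainable parameters of a standard LSTM with biases.

    Four blocks (input, forget, output gates and the candidate state),
    each with an input kernel, a recurrent kernel, and a bias vector.
    """
    return 4 * (features * units + units * units + units)


shape_last = lstm_output_shape(32, 30, 4, 75)
shape_seq = lstm_output_shape(32, 30, 4, 75, return_sequences=True)
n_params = lstm_param_count(4, 75)
print(shape_last, shape_seq, n_params)
```

The same weights are reused at every timestep, which is why timesteps does not appear in `lstm_param_count`.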

Predicting Next Word of LSTM Model from Tensorflow Example

Submitted by 大憨熊 on 2019-12-06 06:31:19
My buddy and I are trying to use the trained model from the LSTM TensorFlow example here. We've been able to train our model, save it, and then import it. We've just used TensorFlow's Supervisor. It was in the tutorial, but you can read more about it here. It's weird because there's not a whole lot of clear documentation for this. I understand that TensorFlow is an API that's going through a lot of changes and adaptations right now, but it's hard to find clear answers. For example, we want to use tf.train.Saver(), but we aren't sure if there is anything comparable to tf

Format time-series data for short term forecasting using Recurrent Neural networks

Submitted by 穿精又带淫゛_ on 2019-12-06 05:37:14
I want to forecast day-ahead power consumption using recurrent neural networks (RNN), but I find the data format required for an RNN (samples, timesteps, features) confusing. Let me explain with an example: I have power_dataset.csv on Dropbox, which contains power consumption from 5 June to 18 June at 10-minute intervals (144 observations per day). To check the performance of the RNN using the rnn R package, I am following these steps: (1) train model M for the usage of 17 June using data from 5-16 June; (2) predict the usage of 18 June using M and the updated usage from 6-17 June. My understanding of RNN
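One possible way to arrange such data into the (samples, timesteps, features) layout is to treat each day as one sample of 144 timesteps with a single feature, paired with the following day as the target. The sketch below is in Python rather than R and uses synthetic values in place of power_dataset.csv; the helper names and this particular arrangement are assumptions, not the poster's method.

```python
OBS_PER_DAY = 144  # 10-minute readings per day, as stated in the question


def split_into_days(readings, obs_per_day=OBS_PER_DAY):
    """Cut a flat list of readings into consecutive whole days."""
    if len(readings) % obs_per_day != 0:
        raise ValueError("readings do not cover whole days")
    return [readings[i:i + obs_per_day]
            for i in range(0, len(readings), obs_per_day)]


def to_samples(days):
    """Pair each day with the next one.

    X has shape (samples, timesteps, features=1); y[i] is the day after X[i].
    """
    X = [[[value] for value in day] for day in days[:-1]]
    y = [day for day in days[1:]]
    return X, y


# Synthetic stand-in for 14 days of data (5 June to 18 June).
readings = [float(i % OBS_PER_DAY) for i in range(14 * OBS_PER_DAY)]
days = split_into_days(readings)
X, y = to_samples(days)
print(len(X), len(X[0]), len(X[0][0]))
```

Under this arrangement, the pairs drawn from 5-17 June would form the training set, and the 17 June sample would be the input for forecasting 18 June.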

Recurrent convolutional BLSTM neural network - arbitrary sequence lengths

Submitted by 时光怂恿深爱的人放手 on 2019-12-06 04:50:56
Question: Using Keras + Theano I successfully made a recurrent bidirectional-LSTM neural network that is capable of training on and classifying DNA sequences of arbitrary lengths, using the following model (for fully working code see: http://pastebin.com/jBLv8B72):
sequence = Input(shape=(None, ONE_HOT_DIMENSION), dtype='float32')
dropout = Dropout(0.2)(sequence)
# bidirectional LSTM
forward_lstm = LSTM( output_dim=50, init='uniform', inner_init='uniform', forget_bias_init='one', return_sequences=True,

Why does RNN always output 1

Submitted by 此生再无相见时 on 2019-12-06 04:43:15
Question: I am using recurrent neural networks (RNN) for forecasting, but for some weird reason the network always outputs 1. Here I explain this with a toy example. Consider a matrix M of dimensions (360, 5) and a vector Y that contains the row sums of M. Using an RNN, I want to predict Y from M. Using the rnn R package, I trained the model as follows:
library(rnn)
M <- matrix(c(1:1800),ncol=5,byrow = TRUE) # Matrix (say features)
Y <- apply(M,1,sum) # Output equals the row sum of M
mt <- array(c(M),dim=c(NROW(M),1