LSTM

Simplest LSTM training with Keras

Submitted by 做~自己de王妃 on 2019-12-03 16:03:10
I would like to create the simplest possible LSTM using the Keras Python library. I have the following code:

    import pandas as pd
    import numpy as np
    from keras.models import Sequential
    from keras.layers.core import Dense, Activation
    from keras.layers.recurrent import LSTM

    X_train = pd.DataFrame(np.array([[1, 2], [3, 4], [5, 6], [7, 8], [5.1, 6.1], [7.1, 8.1]]))
    y_train = pd.DataFrame(np.array([1, 2, 3, 4, 3, 4]))
    X_test = pd.DataFrame(np.array([[1.1, 2.1], [3.1, 4.1]]))
    y_test = pd.DataFrame(np.array([1, 2]))

    model = Sequential()
    model.add(LSTM( output_dim = 10, return_sequences=False,
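The question is cut off above, but a minimal working version can be sketched. The two points the truncated code runs into are that a Keras LSTM expects 3D input of shape (samples, timesteps, features), so the 2D frame has to be reshaped, and that recent Keras versions use units rather than the deprecated output_dim. A sketch under those assumptions (Keras 2.x; layer sizes are illustrative):

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    # Treat each row as a sequence of 2 timesteps with 1 feature each.
    X_train = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [5.1, 6.1], [7.1, 8.1]])
    y_train = np.array([1, 2, 3, 4, 3, 4])
    X_train = X_train.reshape((X_train.shape[0], 2, 1))  # (samples, timesteps, features)

    model = Sequential()
    model.add(LSTM(10, input_shape=(2, 1), return_sequences=False))
    model.add(Dense(1))  # single regression output
    model.compile(loss='mse', optimizer='adam')
    model.fit(X_train, y_train, epochs=100, verbose=0)

    X_test = np.array([[1.1, 2.1], [3.1, 4.1]]).reshape((2, 2, 1))
    print(model.predict(X_test))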

TensorFlow dynamic_rnn regressor: ValueError dimension mismatch

Submitted by 佐手、 on 2019-12-03 14:57:48
I would like to build a toy LSTM model for regression; even this nice tutorial is already too complicated for a beginner. Given a sequence of length time_steps, predict the next value. Consider time_steps=3 and the sequences:

    array([[[ 1.], [ 2.], [ 3.]],
           [[ 2.], [ 3.], [ 4.]],
           ...

The target values should be:

    array([ 4., 5., ...

I define the following model:

    # Network Parameters
    time_steps = 3
    num_neurons = 64  # (arbitrary)
    n_features = 1

    # tf Graph input
    x = tf.placeholder("float", [None, time_steps, n_features])
    y = tf.placeholder("float", [None, 1])

    # Define weights
    weights = { 'out': tf
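The model definition is truncated, but the dimension mismatch in this setup typically comes from multiplying the full RNN output (batch, time_steps, num_neurons) by the output weights instead of just the last timestep. A sketch of a complete dynamic_rnn regressor, assuming TF1 and that the cut-off weights dict maps num_neurons to a single output:

    import tensorflow as tf

    time_steps, num_neurons, n_features = 3, 64, 1

    x = tf.placeholder(tf.float32, [None, time_steps, n_features])
    y = tf.placeholder(tf.float32, [None, 1])

    cell = tf.nn.rnn_cell.BasicLSTMCell(num_neurons)
    outputs, _ = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)
    # outputs has shape (batch, time_steps, num_neurons); keep only the last
    # step, otherwise the matmul below fails with a dimension mismatch.
    last = outputs[:, -1, :]  # (batch, num_neurons)

    W = tf.Variable(tf.random_normal([num_neurons, 1]))
    b = tf.Variable(tf.zeros([1]))
    pred = tf.matmul(last, W) + b  # (batch, 1)

    loss = tf.reduce_mean(tf.square(pred - y))
    train_op = tf.train.AdamOptimizer(0.01).minimize(loss)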

Batch-major vs time-major LSTM

Submitted by 给你一囗甜甜゛ on 2019-12-03 14:00:34
Do RNNs learn different dependency patterns when the input is batch-major as opposed to time-major? (Edit: my initial argument was for why the layout makes sense, but I realized that it doesn't, so that part is a little off-topic.) I haven't found the TF group's reasoning behind this choice, and it does not seem to make computational sense, since the ops are written in C++. Intuitively, we want to combine (multiply/add, etc.) different features from the same sequence at the same timestep. Different timesteps cannot be processed in parallel, while batches/sequences can, so the preferred memory order would be feature > batch/sequence > timestep. By default, NumPy and C++ use row-major
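Whether the layout affects learning is separate from how the two layouts relate: they hold the same data, just transposed, so the learned model is identical. A short sketch, assuming TF1's dynamic_rnn, which accepts either layout via its time_major flag:

    import tensorflow as tf

    batch, time_steps, features = 32, 10, 8

    # Batch-major placeholder: (batch, time, features), TF's default layout.
    x_bm = tf.placeholder(tf.float32, [batch, time_steps, features])

    # The same data in time-major layout is just a transpose: (time, batch, features).
    x_tm = tf.transpose(x_bm, [1, 0, 2])

    cell = tf.nn.rnn_cell.BasicLSTMCell(16)
    # time_major=True tells dynamic_rnn to skip its internal transposes,
    # which can be slightly faster; the result is equivalent either way.
    outputs, state = tf.nn.dynamic_rnn(cell, x_tm, time_major=True, dtype=tf.float32)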

Regularization for LSTM in tensorflow

Submitted by 喜夏-厌秋 on 2019-12-03 13:33:45
Question: TensorFlow offers a nice LSTM wrapper:

    rnn_cell.BasicLSTM(num_units, forget_bias=1.0, input_size=None,
                       state_is_tuple=False, activation=tanh)

I would like to use regularization, say L2 regularization. However, I don't have direct access to the different weight matrices used in the LSTM cell, so I cannot explicitly do something like:

    loss = something + beta * tf.reduce_sum(tf.nn.l2_loss(weights))

Is there a way to access the matrices, or to use regularization with an LSTM some other way?

Answer 1: tf.trainable
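The answer is cut off at tf.trainable, but the approach it appears to point at is collecting the cell's weights through tf.trainable_variables() and penalizing them. A sketch, assuming TF1; excluding biases by name is a common convention, not something stated in the truncated answer:

    import tensorflow as tf

    beta = 0.001
    x = tf.placeholder(tf.float32, [None, 10, 4])
    cell = tf.nn.rnn_cell.BasicLSTMCell(32)
    outputs, _ = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)

    base_loss = tf.reduce_mean(tf.square(outputs))  # stand-in for the real task loss

    # The LSTM's internal kernel shows up in tf.trainable_variables(), even
    # though it is not exposed directly; biases are usually not penalized.
    l2 = tf.add_n([tf.nn.l2_loss(v) for v in tf.trainable_variables()
                   if 'bias' not in v.name.lower()])
    loss = base_loss + beta * l2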

Keras LSTM training data format

Submitted by ⅰ亾dé卋堺 on 2019-12-03 13:10:39
I am trying to use an LSTM neural network (using Keras) to predict an opponent's next move in the game Rock-Paper-Scissors. I have encoded the inputs as Rock: [1 0 0], Paper: [0 1 0], Scissors: [0 0 1]. Now I want to train the neural network, but I am a bit confused about the structure of my training data. I have stored an opponent's game history in a .csv file with the following structure:

    1,0,0
    0,1,0
    0,1,0
    0,0,1
    1,0,0
    0,1,0
    0,1,0
    0,0,1
    1,0,0
    0,0,1

I am trying to use every 5th entry as a training label and the previous 4 entries as the training input. In other words, at each time step, a vector
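The question is truncated, but the described windowing can be sketched directly: slide a window of 4 one-hot moves over the history and take the following move as the label. A sketch assuming Keras 2.x; the file name history.csv and layer sizes are illustrative:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    history = np.loadtxt('history.csv', delimiter=',')  # shape (n_moves, 3)

    window = 4
    X = np.array([history[i:i + window] for i in range(len(history) - window)])
    y = history[window:]  # the move following each window
    # X: (samples, 4, 3), y: (samples, 3) -- the 3D shape a Keras LSTM expects

    model = Sequential()
    model.add(LSTM(16, input_shape=(window, 3)))
    model.add(Dense(3, activation='softmax'))  # probabilities for rock/paper/scissors
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    model.fit(X, y, epochs=50, verbose=0)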

Keras - Input a 3 channel image into LSTM

Submitted by 霸气de小男生 on 2019-12-03 12:22:43
I have read a sequence of images into a NumPy array with shape (7338, 225, 1024, 3), where 7338 is the sample size, 225 is the number of time steps, and 1024 (32x32) is the number of flattened image pixels, in 3 channels (RGB). I have a sequential model with an LSTM layer:

    model = Sequential()
    model.add(LSTM(128, input_shape=(225, 1024, 3)))

But this results in the error:

    Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=4

The documentation says that the input tensor for an LSTM layer should be a 3D tensor with shape (batch_size, timesteps, input_dim), but in my case my input_dim is 2D. What is
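The question breaks off here, but the usual fix follows from the error message: an LSTM step consumes a flat feature vector, so the pixels and channels must be merged into a single input_dim (1024 x 3 = 3072). Alternatively, a convolutional front end (e.g. TimeDistributed Conv2D, or ConvLSTM2D) preserves the spatial structure. A minimal sketch of the reshape route, assuming Keras 2.x, with a small stand-in array in place of the real 20 GB one:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    # The real array has shape (7338, 225, 1024, 3); a small stand-in here.
    x = np.zeros((10, 225, 1024, 3), dtype='float32')
    x = x.reshape((x.shape[0], x.shape[1], -1))  # (samples, 225, 3072)

    model = Sequential()
    model.add(LSTM(128, input_shape=(225, 3072)))  # 3D input: (timesteps, input_dim)
    model.add(Dense(1))
    model.compile(loss='mse', optimizer='adam')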

Understanding LSTM

Submitted by 你说的曾经没有我的故事 on 2019-12-03 12:13:07
1. RNNs

An RNN can be seen as many copies of the same ordinary network stacked together, with each copy passing its output on to the next. Unrolling an RNN across time steps gives the diagram below. (figure)

The chain-like structure of RNNs makes it easy to see that they are naturally suited to sequential information.

2. The problem of long-term dependencies

As the gap between the relevant information and the point where it is needed for prediction grows, RNNs find it increasingly hard to connect the two. LSTMs, however, can solve this problem.

3. LSTM networks

Long Short-Term Memory networks are usually just called LSTMs. They are designed to avoid the long-term dependency problem described above; by construction, they can remember information over long periods.

RNNs are built by repeating an identical module. In a plain RNN this module is very simple, for example a single tanh layer. (figure)

LSTMs have a similar chain structure, but the repeating module is not a single tanh layer; it contains four layers that interact in a special way. (figure)

First, define the symbols used: (figure)

3.1 The core idea of LSTMs

The key to LSTMs is the cell, i.e. the state shown in green and the horizontal line running across the structure diagram. (figure)

The cell state is like a conveyor belt: vectors travel along it undergoing only a few linear operations, so information can pass through the cell largely unchanged (this is what makes long-term memory retention possible).

LSTMs add or remove information from the cell state through structures called gates. Gates can selectively let information through.
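The post describes the gates qualitatively via diagrams that did not survive extraction; for reference, the standard LSTM equations (the usual formulation, supplied here rather than taken from the post) make the gates' roles concrete:

$$
\begin{aligned}
f_t &= \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) && \text{input gate} \\
\tilde{C}_t &= \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) && \text{candidate cell state} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{cell state update} \\
o_t &= \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) && \text{output gate} \\
h_t &= o_t \odot \tanh(C_t) && \text{hidden state / output}
\end{aligned}
$$

The update for $C_t$ is the "conveyor belt": only an elementwise scale by $f_t$ and an addition touch the cell state, which is why information can persist across many timesteps.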

Easily confused aspects of LSTM

Submitted by 半腔热情 on 2019-12-03 10:20:25
1. If you just want to learn how to use an LSTM, you can understand it like this: an LSTM can be seen as a warehouse, and this warehouse has three gatekeepers, whose jobs are:

Forget gate: decides which items should be discarded from the warehouse.
Input gate: decides which incoming items should be stored in the warehouse.
Output gate: decides what to output, based on the incoming items and the current state of the warehouse.

But how do these three gatekeepers judge what to forget, what to let in, and what to output? They have to learn this from historical data, so that when new goods arrive in the future they know how to handle them. This is why an LSTM can learn from historical data and retain knowledge: it has these three gates.

The concepts I kept confusing while learning about LSTMs are:

1. Multiple time steps. (figure from https://www.jiqizhixin.com/articles/2019-04-01-8?from=synced&keyword=LSTM) That is, using the values at t-2, t-1, and t to predict the value at t+1.

2. Multiple variables, i.e. using several features to make the prediction; the figure above does not use multiple time steps. The blog post at https://machinelearningmastery.com/multivariate-time-series-forecasting-lstms-keras/ may help here. The point to note is that it covers both multivariate single-lag and multivariate multiple-lag timesteps (Multiple Lag Timesteps). Their biggest difference is the input shape: the first of the original figures showed a single lag timestep (1), the second showed multiple lag timesteps (n); the figures were not preserved, but the sketch below shows the same contrast.
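The distinction comes down to the shape of the LSTM input tensor, (samples, timesteps, features). A sketch with hypothetical sizes, not taken from the linked posts:

    import numpy as np

    # Multivariate, single lag timestep: each sample is 1 timestep of 8 features.
    X_single_lag = np.zeros((1000, 1, 8))   # (samples, timesteps=1, features=8)

    # Multivariate, multiple lag timesteps: each sample is 3 timesteps of 8 features.
    X_multi_lag = np.zeros((1000, 3, 8))    # (samples, timesteps=n, features=8)

    # Univariate, multiple timesteps (use t-2, t-1, t to predict t+1):
    series = np.arange(100, dtype='float32')
    window = 3
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    X = X.reshape((-1, window, 1))          # (samples, timesteps=3, features=1)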

Keras LSTM on CPU faster than GPU?

Submitted by 北慕城南 on 2019-12-03 09:53:34
Question: I am testing LSTM networks in Keras and I am getting much faster training on the CPU (5 seconds/epoch on an i2600k with 16 GB RAM) than on the GPU (35 seconds on an Nvidia 1060 6 GB). GPU utilisation sits at around 15%, and I never see it exceed 30% when trying other LSTM networks, including the Keras examples. When I run other types of networks (MLP and CNN) the GPU is much faster. I am using the latest Theano (0.9.0dev4) and Keras 1.2.0. The sequence has 50,000 timesteps with 3 inputs (ints). If the inputs are descending (3, 2
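This pattern is common: an LSTM's recurrence is inherently sequential, so with a small layer the GPU spends most of its time on per-timestep kernel launch overhead rather than compute, especially over 50,000 timesteps. Later stacks mitigate this with larger batch sizes and the cuDNN-fused recurrent layers. A sketch assuming a newer setup than the question's Theano/Keras 1.2.0, namely Keras >= 2.0.9 with the TensorFlow backend, where CuDNNLSTM is available:

    from keras.models import Sequential
    from keras.layers import CuDNNLSTM, Dense  # needs Keras >= 2.0.9, TF backend, a GPU

    model = Sequential()
    # CuDNNLSTM fuses the whole recurrence into a single cuDNN kernel,
    # avoiding the per-timestep launch overhead that keeps utilisation low.
    model.add(CuDNNLSTM(128, input_shape=(50000, 3)))
    model.add(Dense(1))
    model.compile(loss='mse', optimizer='adam')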

Siamese Model with LSTM network fails to train using tensorflow

Submitted by Anonymous (unverified) on 2019-12-03 09:52:54
Question:

Dataset description: the dataset contains a set of question pairs and a label which tells whether the questions are the same, e.g. "How do I read and find my YouTube comments?", "How can I see all my Youtube comments?", "1". The goal of the model is to identify whether a given question pair is the same or different.

Approach: I have created a Siamese network to identify whether two questions are the same. Following is the model:

    graph = tf.Graph()
    with graph.as_default():
        embedding_placeholder = tf.placeholder(tf.float32, shape=embedding_matrix.shape, name='embedding
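The model definition is cut off at the embedding placeholder. For reference, a minimal sketch of a shared-weight (Siamese) LSTM encoder in the same TF1 style; all sizes, the contrastive loss, and the variable names are illustrative assumptions, not taken from the original post:

    import tensorflow as tf

    vocab_size, embed_dim, seq_len, hidden = 10000, 100, 30, 64

    graph = tf.Graph()
    with graph.as_default():
        q1 = tf.placeholder(tf.int32, [None, seq_len])
        q2 = tf.placeholder(tf.int32, [None, seq_len])
        labels = tf.placeholder(tf.float32, [None])  # 1 = same question, 0 = different

        embedding = tf.get_variable('embedding', [vocab_size, embed_dim])

        def encode(tokens, reuse):
            # Reusing the same variable scope is what ties ("shares") the weights
            # of the two branches -- the defining property of a Siamese network.
            with tf.variable_scope('encoder', reuse=reuse):
                emb = tf.nn.embedding_lookup(embedding, tokens)
                cell = tf.nn.rnn_cell.LSTMCell(hidden)
                _, state = tf.nn.dynamic_rnn(cell, emb, dtype=tf.float32)
                return state.h  # final hidden state as the question vector

        v1 = encode(q1, reuse=False)
        v2 = encode(q2, reuse=True)

        # Contrastive loss on the Euclidean distance between the two encodings.
        d = tf.sqrt(tf.reduce_sum(tf.square(v1 - v2), axis=1) + 1e-6)
        loss = tf.reduce_mean(labels * tf.square(d) +
                              (1.0 - labels) * tf.square(tf.maximum(1.0 - d, 0.0)))
        train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)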