LSTM

Simplest LSTM training with Keras

Submitted by 做~自己de王妃 on 2019-12-03 16:03:10
I would like to create the simplest possible LSTM using the Keras Python library. I have the following code:

    import pandas as pd
    import numpy as np
    from keras.models import Sequential
    from keras.layers.core import Dense, Activation
    from keras.layers.recurrent import LSTM

    X_train = pd.DataFrame(np.array([[1, 2], [3, 4], [5, 6], [7, 8], [5.1, 6.1], [7.1, 8.1]]))
    y_train = pd.DataFrame(np.array([1, 2, 3, 4, 3, 4]))
    X_test = pd.DataFrame(np.array([[1.1, 2.1], [3.1, 4.1]]))
    y_test = pd.DataFrame(np.array([1, 2]))

    model = Sequential()
    model.add(LSTM( output_dim = 10, return_sequences=False,
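The question is cut off above, but a minimal working version can be sketched. The two points the truncated code runs into are that a Keras LSTM expects 3D input of shape (samples, timesteps, features), so the 2D frame has to be reshaped, and that recent Keras versions use units rather than the deprecated output_dim. A sketch under those assumptions (Keras 2.x; layer sizes are illustrative):

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    # Treat each row as a sequence of 2 timesteps with 1 feature each.
    X_train = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [5.1, 6.1], [7.1, 8.1]])
    y_train = np.array([1, 2, 3, 4, 3, 4])
    X_train = X_train.reshape((X_train.shape[0], 2, 1))  # (samples, timesteps, features)

    model = Sequential()
    model.add(LSTM(10, input_shape=(2, 1), return_sequences=False))
    model.add(Dense(1))  # single regression output
    model.compile(loss='mse', optimizer='adam')
    model.fit(X_train, y_train, epochs=100, verbose=0)

    X_test = np.array([[1.1, 2.1], [3.1, 4.1]]).reshape((2, 2, 1))
    print(model.predict(X_test))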

TensorFlow dynamic_rnn regressor: ValueError dimension mismatch

Submitted by 佐手、 on 2019-12-03 14:57:48
I would like to build a toy LSTM model for regression; even this nice tutorial is already too complicated for a beginner. Given a sequence of length time_steps, predict the next value. Consider time_steps=3 and the sequences:

    array([[[ 1.], [ 2.], [ 3.]],
           [[ 2.], [ 3.], [ 4.]],
           ...

The target values should be:

    array([ 4., 5., ...

I define the following model:

    # Network Parameters
    time_steps = 3
    num_neurons = 64  # (arbitrary)
    n_features = 1

    # tf Graph input
    x = tf.placeholder("float", [None, time_steps, n_features])
    y = tf.placeholder("float", [None, 1])

    # Define weights
    weights = { 'out': tf
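The model definition is truncated, but the dimension mismatch in this setup typically comes from multiplying the full RNN output (batch, time_steps, num_neurons) by the output weights instead of just the last timestep. A sketch of a complete dynamic_rnn regressor, assuming TF1 and that the cut-off weights dict maps num_neurons to a single output:

    import tensorflow as tf

    time_steps, num_neurons, n_features = 3, 64, 1

    x = tf.placeholder(tf.float32, [None, time_steps, n_features])
    y = tf.placeholder(tf.float32, [None, 1])

    cell = tf.nn.rnn_cell.BasicLSTMCell(num_neurons)
    outputs, _ = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)
    # outputs has shape (batch, time_steps, num_neurons); keep only the last
    # step, otherwise the matmul below fails with a dimension mismatch.
    last = outputs[:, -1, :]  # (batch, num_neurons)

    W = tf.Variable(tf.random_normal([num_neurons, 1]))
    b = tf.Variable(tf.zeros([1]))
    pred = tf.matmul(last, W) + b  # (batch, 1)

    loss = tf.reduce_mean(tf.square(pred - y))
    train_op = tf.train.AdamOptimizer(0.01).minimize(loss)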

Batch-major vs time-major LSTM

Submitted by 给你一囗甜甜゛ on 2019-12-03 14:00:34
Do RNNs learn different dependency patterns when the input is batch-major as opposed to time-major? (Edit: my initial argument was for why the layout makes sense, but I realized that it doesn't, so that part is a little off-topic.) I haven't found the TF group's reasoning behind this choice, and it does not seem to make computational sense, since the ops are written in C++. Intuitively, we want to combine (multiply/add, etc.) different features from the same sequence at the same timestep. Different timesteps cannot be processed in parallel, while batches/sequences can, so the preferred memory order would be feature > batch/sequence > timestep. By default, NumPy and C++ use row-major
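Whether the layout affects learning is separate from how the two layouts relate: they hold the same data, just transposed, so the learned model is identical. A short sketch, assuming TF1's dynamic_rnn, which accepts either layout via its time_major flag:

    import tensorflow as tf

    batch, time_steps, features = 32, 10, 8

    # Batch-major placeholder: (batch, time, features), TF's default layout.
    x_bm = tf.placeholder(tf.float32, [batch, time_steps, features])

    # The same data in time-major layout is just a transpose: (time, batch, features).
    x_tm = tf.transpose(x_bm, [1, 0, 2])

    cell = tf.nn.rnn_cell.BasicLSTMCell(16)
    # time_major=True tells dynamic_rnn to skip its internal transposes,
    # which can be slightly faster; the result is equivalent either way.
    outputs, state = tf.nn.dynamic_rnn(cell, x_tm, time_major=True, dtype=tf.float32)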

Regularization for LSTM in tensorflow

Submitted by 喜夏-厌秋 on 2019-12-03 13:33:45
Question: TensorFlow offers a nice LSTM wrapper:

    rnn_cell.BasicLSTM(num_units, forget_bias=1.0, input_size=None,
                       state_is_tuple=False, activation=tanh)

I would like to use regularization, say L2 regularization. However, I don't have direct access to the different weight matrices used in the LSTM cell, so I cannot explicitly do something like:

    loss = something + beta * tf.reduce_sum(tf.nn.l2_loss(weights))

Is there a way to access the matrices, or to use regularization with an LSTM some other way?

Answer 1: tf.trainable
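The answer is cut off at tf.trainable, but the approach it appears to point at is collecting the cell's weights through tf.trainable_variables() and penalizing them. A sketch, assuming TF1; excluding biases by name is a common convention, not something stated in the truncated answer:

    import tensorflow as tf

    beta = 0.001
    x = tf.placeholder(tf.float32, [None, 10, 4])
    cell = tf.nn.rnn_cell.BasicLSTMCell(32)
    outputs, _ = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)

    base_loss = tf.reduce_mean(tf.square(outputs))  # stand-in for the real task loss

    # The LSTM's internal kernel shows up in tf.trainable_variables(), even
    # though it is not exposed directly; biases are usually not penalized.
    l2 = tf.add_n([tf.nn.l2_loss(v) for v in tf.trainable_variables()
                   if 'bias' not in v.name.lower()])
    loss = base_loss + beta * l2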

Keras LSTM training data format

Submitted by ⅰ亾dé卋堺 on 2019-12-03 13:10:39
I am trying to use an LSTM neural network (using Keras) to predict an opponent's next move in the game Rock-Paper-Scissors. I have encoded the inputs as Rock: [1 0 0], Paper: [0 1 0], Scissors: [0 0 1]. Now I want to train the neural network, but I am a bit confused about the structure of my training data. I have stored an opponent's game history in a .csv file with the following structure:

    1,0,0
    0,1,0
    0,1,0
    0,0,1
    1,0,0
    0,1,0
    0,1,0
    0,0,1
    1,0,0
    0,0,1

I am trying to use every 5th entry as a training label and the previous 4 entries as the training input. In other words, at each time step, a vector
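The question is truncated, but the described windowing can be sketched directly: slide a window of 4 one-hot moves over the history and take the following move as the label. A sketch assuming Keras 2.x; the file name history.csv and layer sizes are illustrative:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    history = np.loadtxt('history.csv', delimiter=',')  # shape (n_moves, 3)

    window = 4
    X = np.array([history[i:i + window] for i in range(len(history) - window)])
    y = history[window:]  # the move following each window
    # X: (samples, 4, 3), y: (samples, 3) -- the 3D shape a Keras LSTM expects

    model = Sequential()
    model.add(LSTM(16, input_shape=(window, 3)))
    model.add(Dense(3, activation='softmax'))  # probabilities for rock/paper/scissors
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    model.fit(X, y, epochs=50, verbose=0)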

Keras - Input a 3 channel image into LSTM

Submitted by 霸气de小男生 on 2019-12-03 12:22:43
I have read a sequence of images into a NumPy array with shape (7338, 225, 1024, 3), where 7338 is the sample size, 225 is the number of time steps, and 1024 (32x32) is the number of flattened image pixels, in 3 channels (RGB). I have a sequential model with an LSTM layer:

    model = Sequential()
    model.add(LSTM(128, input_shape=(225, 1024, 3)))

But this results in the error:

    Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=4

The documentation says that the input tensor for an LSTM layer should be a 3D tensor with shape (batch_size, timesteps, input_dim), but in my case my input_dim is 2D. What is
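The question breaks off here, but the usual fix follows from the error message: an LSTM step consumes a flat feature vector, so the pixels and channels must be merged into a single input_dim (1024 x 3 = 3072). Alternatively, a convolutional front end (e.g. TimeDistributed Conv2D, or ConvLSTM2D) preserves the spatial structure. A minimal sketch of the reshape route, assuming Keras 2.x, with a small stand-in array in place of the real 20 GB one:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    # The real array has shape (7338, 225, 1024, 3); a small stand-in here.
    x = np.zeros((10, 225, 1024, 3), dtype='float32')
    x = x.reshape((x.shape[0], x.shape[1], -1))  # (samples, 225, 3072)

    model = Sequential()
    model.add(LSTM(128, input_shape=(225, 3072)))  # 3D input: (timesteps, input_dim)
    model.add(Dense(1))
    model.compile(loss='mse', optimizer='adam')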

Understanding LSTM

Submitted by 你说的曾经没有我的故事 on 2019-12-03 12:13:07
1. RNNs

An RNN can be seen as many copies of the same ordinary network stacked together, with each copy passing its output on to the next. Unrolling an RNN across time steps gives the diagram below. (figure)

The chain-like structure of RNNs makes it easy to see that they are naturally suited to sequential information.

2. The problem of long-term dependencies

As the gap between the relevant information and the point where it is needed for prediction grows, RNNs find it increasingly hard to connect the two. LSTMs, however, can solve this problem.

3. LSTM networks

Long Short-Term Memory networks are usually just called LSTMs. They are designed to avoid the long-term dependency problem described above; by construction, they can remember information over long periods.

RNNs are built by repeating an identical module. In a plain RNN this module is very simple, for example a single tanh layer. (figure)

LSTMs have a similar chain structure, but the repeating module is not a single tanh layer; it contains four layers that interact in a special way. (figure)

First, define the symbols used: (figure)

3.1 The core idea of LSTMs

The key to LSTMs is the cell, i.e. the state shown in green and the horizontal line running across the structure diagram. (figure)

The cell state is like a conveyor belt: vectors travel along it undergoing only a few linear operations, so information can pass through the cell largely unchanged (this is what makes long-term memory retention possible).

LSTMs add or remove information from the cell state through structures called gates. Gates can selectively let information through.
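The post describes the gates qualitatively via diagrams that did not survive extraction; for reference, the standard LSTM equations (the usual formulation, supplied here rather than taken from the post) make the gates' roles concrete:

$$
\begin{aligned}
f_t &= \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) && \text{input gate} \\
\tilde{C}_t &= \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) && \text{candidate cell state} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{cell state update} \\
o_t &= \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) && \text{output gate} \\
h_t &= o_t \odot \tanh(C_t) && \text{hidden state / output}
\end{aligned}
$$

The update for $C_t$ is the "conveyor belt": only an elementwise scale by $f_t$ and an addition touch the cell state, which is why information can persist across many timesteps.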

Easily confused aspects of LSTM

Submitted by 半腔热情 on 2019-12-03 10:20:25
1. If you just want to learn how to use an LSTM, you can understand it like this: an LSTM can be seen as a warehouse, and this warehouse has three gatekeepers, whose jobs are:

Forget gate: decides which items should be discarded from the warehouse.
Input gate: decides which incoming items should be stored in the warehouse.
Output gate: decides what to output, based on the incoming items and the current state of the warehouse.

But how do these three gatekeepers judge what to forget, what to let in, and what to output? They have to learn this from historical data, so that when new goods arrive in the future they know how to handle them. This is why an LSTM can learn from historical data and retain knowledge: it has these three gates.

The concepts I kept confusing while learning about LSTMs are:

1. Multiple time steps. (figure from https://www.jiqizhixin.com/articles/2019-04-01-8?from=synced&keyword=LSTM) That is, using the values at t-2, t-1, and t to predict the value at t+1.

2. Multiple variables, i.e. using several features to make the prediction; the figure above does not use multiple time steps. The blog post at https://machinelearningmastery.com/multivariate-time-series-forecasting-lstms-keras/ may help here. The point to note is that it covers both multivariate single-lag and multivariate multiple-lag timesteps (Multiple Lag Timesteps). Their biggest difference is the input shape: the first of the original figures showed a single lag timestep (1), the second showed multiple lag timesteps (n); the figures were not preserved, but the sketch below shows the same contrast.
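The distinction comes down to the shape of the LSTM input tensor, (samples, timesteps, features). A sketch with hypothetical sizes, not taken from the linked posts:

    import numpy as np

    # Multivariate, single lag timestep: each sample is 1 timestep of 8 features.
    X_single_lag = np.zeros((1000, 1, 8))   # (samples, timesteps=1, features=8)

    # Multivariate, multiple lag timesteps: each sample is 3 timesteps of 8 features.
    X_multi_lag = np.zeros((1000, 3, 8))    # (samples, timesteps=n, features=8)

    # Univariate, multiple timesteps (use t-2, t-1, t to predict t+1):
    series = np.arange(100, dtype='float32')
    window = 3
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    X = X.reshape((-1, window, 1))          # (samples, timesteps=3, features=1)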

Keras LSTM on CPU faster than GPU?

Submitted by 北慕城南 on 2019-12-03 09:53:34
Question: I am testing LSTM networks in Keras and I am getting much faster training on the CPU (5 seconds/epoch on an i2600k with 16 GB RAM) than on the GPU (35 seconds on an Nvidia 1060 6 GB). GPU utilisation sits at around 15%, and I never see it exceed 30% when trying other LSTM networks, including the Keras examples. When I run other types of networks (MLP and CNN) the GPU is much faster. I am using the latest Theano (0.9.0dev4) and Keras 1.2.0. The sequence has 50,000 timesteps with 3 inputs (ints). If the inputs are descending (3, 2
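This pattern is common: an LSTM's recurrence is inherently sequential, so with a small layer the GPU spends most of its time on per-timestep kernel launch overhead rather than compute, especially over 50,000 timesteps. Later stacks mitigate this with larger batch sizes and the cuDNN-fused recurrent layers. A sketch assuming a newer setup than the question's Theano/Keras 1.2.0, namely Keras >= 2.0.9 with the TensorFlow backend, where CuDNNLSTM is available:

    from keras.models import Sequential
    from keras.layers import CuDNNLSTM, Dense  # needs Keras >= 2.0.9, TF backend, a GPU

    model = Sequential()
    # CuDNNLSTM fuses the whole recurrence into a single cuDNN kernel,
    # avoiding the per-timestep launch overhead that keeps utilisation low.
    model.add(CuDNNLSTM(128, input_shape=(50000, 3)))
    model.add(Dense(1))
    model.compile(loss='mse', optimizer='adam')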

Siamese Model with LSTM network fails to train using tensorflow

Submitted by Anonymous (unverified) on 2019-12-03 09:52:54
Question:

Dataset description: the dataset contains a set of question pairs and a label which tells whether the questions are the same, e.g. "How do I read and find my YouTube comments?", "How can I see all my Youtube comments?", "1". The goal of the model is to identify whether a given question pair is the same or different.

Approach: I have created a Siamese network to identify whether two questions are the same. Following is the model:

    graph = tf.Graph()
    with graph.as_default():
        embedding_placeholder = tf.placeholder(tf.float32, shape=embedding_matrix.shape, name='embedding
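The model definition is cut off at the embedding placeholder. For reference, a minimal sketch of a shared-weight (Siamese) LSTM encoder in the same TF1 style; all sizes, the contrastive loss, and the variable names are illustrative assumptions, not taken from the original post:

    import tensorflow as tf

    vocab_size, embed_dim, seq_len, hidden = 10000, 100, 30, 64

    graph = tf.Graph()
    with graph.as_default():
        q1 = tf.placeholder(tf.int32, [None, seq_len])
        q2 = tf.placeholder(tf.int32, [None, seq_len])
        labels = tf.placeholder(tf.float32, [None])  # 1 = same question, 0 = different

        embedding = tf.get_variable('embedding', [vocab_size, embed_dim])

        def encode(tokens, reuse):
            # Reusing the same variable scope is what ties ("shares") the weights
            # of the two branches -- the defining property of a Siamese network.
            with tf.variable_scope('encoder', reuse=reuse):
                emb = tf.nn.embedding_lookup(embedding, tokens)
                cell = tf.nn.rnn_cell.LSTMCell(hidden)
                _, state = tf.nn.dynamic_rnn(cell, emb, dtype=tf.float32)
                return state.h  # final hidden state as the question vector

        v1 = encode(q1, reuse=False)
        v2 = encode(q2, reuse=True)

        # Contrastive loss on the Euclidean distance between the two encodings.
        d = tf.sqrt(tf.reduce_sum(tf.square(v1 - v2), axis=1) + 1e-6)
        loss = tf.reduce_mean(labels * tf.square(d) +
                              (1.0 - labels) * tf.square(tf.maximum(1.0 - d, 0.0)))
        train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)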