LSTM Autoencoder

Submitted by Anonymous (unverified) on 2019-12-03 02:16:02

Question:

I'm trying to build an LSTM autoencoder with the goal of getting a fixed-size vector from a sequence that represents the sequence as well as possible. This autoencoder consists of two parts:

  • LSTM Encoder: Takes a sequence and returns an output vector (return_sequences = False)
  • LSTM Decoder: Takes an output vector and returns a sequence (return_sequences = True)

So, in the end, the encoder is a many to one LSTM and the decoder is a one to many LSTM.

Image source: Andrej Karpathy

At a high level, the code looks like this (similar to what is described here):

encoder = Model(...)
decoder = Model(...)

autoencoder = Model(encoder.inputs, decoder(encoder(encoder.inputs)))

autoencoder.compile(loss='binary_crossentropy',
                    optimizer='adam',
                    metrics=['accuracy'])

autoencoder.fit(data, data,
                batch_size=100,
                epochs=1500)

The data array has shape (number of training examples, sequence length, input dimension) = (1200, 10, 5) and looks like this:

array([[[1, 0, 0, 0, 0],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
        ...,
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0]],
        ... ]

Problem: I am not sure how to proceed, especially how to integrate the LSTM layers into a Model and how to get the decoder to generate a sequence from a vector.

I am using Keras with the TensorFlow backend.

EDIT: If someone wants to try it out, here is my procedure to generate random sequences with moving ones (including padding):

import random
import math

def getNotSoRandomList(x):
    rlen = 8
    rlist = [0 for x in range(rlen)]
    if x <= 7:
        rlist[x] = 1
    return rlist


sequence = [[getNotSoRandomList(x) for x in range(round(random.uniform(0, 10)))] for y in range(5000)]

### Padding afterwards

from keras.preprocessing import sequence as seq

data = seq.pad_sequences(
    sequences = sequence,
    padding='post',
    maxlen=None,
    truncating='post',
    value=0.
)
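A quick sanity check of the result (the padded length depends on the longest randomly generated sequence, so the middle dimension may vary; the feature dimension is 8 here, per rlen above):

print(data.shape)   # roughly (5000, longest generated length, 8)
print(data[0])      # first padded sequence of moving ones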

Answer 1:

Models can be built any way you want. If I understood correctly, you just want to know how to create models with LSTMs?

Using LSTMs

Well, first, you have to define what your encoded vector looks like. Suppose you want it to be an array of 20 elements, a 1-dimensional vector, so shape (None, 20). Its size is up to you, and there is no clear rule for knowing the ideal one.

And your input must be three-dimensional, such as your (1200, 10, 5). In Keras summaries and error messages it will be shown as (None, 10, 5), since "None" represents the batch size, which can vary each time you train/predict.

There are many ways to do this, but, suppose you want only one LSTM layer:

from keras.layers import *
from keras.models import Model

inpE = Input((10,5)) #here, you don't define the batch size
outE = LSTM(units = 20, return_sequences=False, ...optional parameters...)(inpE)

This is enough for a very simple encoder resulting in an array with 20 elements (but you can stack more layers if you want; a stacking sketch follows the model below). Let's create the model:

encoder = Model(inpE,outE)    
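If you did want to stack more layers in the encoder, the usual pattern is to keep return_sequences=True on every LSTM except the last one. A minimal sketch, reusing inpE from above (the layer sizes are arbitrary placeholders):

hidden = LSTM(40, return_sequences=True)(inpE)        # still outputs a full sequence, (None, 10, 40)
outEDeep = LSTM(20, return_sequences=False)(hidden)   # collapses to one 20-element vector per sample
deepEncoder = Model(inpE, outEDeep)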

Now, for the decoder, it gets less obvious. You don't have an actual sequence anymore, but a static, meaningful vector. You may still want to use LSTMs; they will treat the vector as a sequence.

But here, since the input has shape (None,20), you must first reshape it to some 3-dimensional array in order to attach an LSTM layer next.

The way you will reshape it is entirely up to you. 20 steps of 1 element? 1 step of 20 elements? 10 steps of 2 elements? Who knows?

inpD = Input((20,))
outD = Reshape((10,2))(inpD) #supposing 10 steps of 2 elements

It's important to notice that if you don't have 10 steps anymore, you won't be able to just enable "return_sequences" and get the output you want; you'll have to work a little. Actually, it's not necessary to use "return_sequences" or even to use LSTMs at all, but you may do that.

Since my reshape has 10 timesteps (intentionally), it is fine to use "return_sequences", because the result will have 10 timesteps (like the initial input).

outD1 = LSTM(5,return_sequences=True,...optional parameters...)(outD)     #5 cells because we want a (None,10,5) vector.    

You could do this in many other ways, such as simply creating a 50-cell LSTM without returning sequences and then reshaping the result:

alternativeOut = LSTM(50,return_sequences=False,...)(outD)
alternativeOut = Reshape((10,5))(alternativeOut)

And our model goes:

decoder = Model(inpD,outD1)
alternativeDecoder = Model(inpD,alternativeOut)

After that, you unite the models with your code and train the autoencoder. All three models share the same weights, so you can get the encoder's results just by using its predict method.

encoderPredictions = encoder.predict(data) 
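For completeness, a sketch of how the pieces chain together for training, reusing the encoder and decoder defined above and mirroring the compile/fit calls from the question (the hyperparameters are the question's, not tuned values, and data is assumed to have the (1200, 10, 5) shape described there):

autoencoder = Model(encoder.inputs, decoder(encoder(encoder.inputs)))

autoencoder.compile(loss='binary_crossentropy',
                    optimizer='adam',
                    metrics=['accuracy'])

autoencoder.fit(data, data,
                batch_size=100,
                epochs=1500)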

What I often see about LSTMs for generating sequences is something like predicting the next element.

You take just a few elements of the sequence and try to predict the next element. Then you take another segment shifted one step forward, and so on. This may be helpful for generating sequences.
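As an illustration of that sliding-window idea (not part of the autoencoder above; the window size is an arbitrary placeholder):

import numpy as np

def next_step_pairs(sequence, window=3):
    # slice one (timesteps, features) sequence into (window, features) inputs
    # and use the element right after each window as the target
    X, y = [], []
    for i in range(len(sequence) - window):
        X.append(sequence[i:i + window])
        y.append(sequence[i + window])
    return np.array(X), np.array(y)

X, y = next_step_pairs(data[0])   # training pairs from the first padded sequence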



Answer 2:

You can find a simple sequence-to-sequence autoencoder here: https://blog.keras.io/building-autoencoders-in-keras.html
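That post builds the sequence-to-sequence autoencoder around RepeatVector rather than Reshape; the general pattern looks roughly like this (a sketch of the idea, not a verbatim copy; the sizes are placeholders chosen to match the question's data):

from keras.layers import Input, LSTM, RepeatVector
from keras.models import Model

timesteps, input_dim, latent_dim = 10, 5, 20   # placeholder sizes

inputs = Input(shape=(timesteps, input_dim))
encoded = LSTM(latent_dim)(inputs)                          # many to one: (None, latent_dim)

decoded = RepeatVector(timesteps)(encoded)                  # repeat the vector once per timestep
decoded = LSTM(input_dim, return_sequences=True)(decoded)   # one to many: (None, timesteps, input_dim)

sequence_autoencoder = Model(inputs, decoded)
encoder = Model(inputs, encoded)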


