Question
I'm trying to create an autoencoder for processes. Each process is a sequence of events, and each event is represented as a number from 0 to 461 (importantly, events with close numbers are not similar; the numbers were assigned randomly). Each process has length 60 and the total number of processes is n, so my input data is an array of shape (n, 60).
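For concreteness, a tiny made-up sample with the same shape and value range (the contents below are random placeholders, not real process data):
import numpy as np

n = 1000  # assumed number of processes
X_train = np.random.randint(0, 462, size=(n, 60))  # each row: 60 event numbers in [0, 461]
print(X_train.shape)  # (1000, 60)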
First, I created an Embedding layer to convert event numbers to a one-hot representation:
from keras.layers import Input, Embedding
from keras.models import Model

BLOCK_LEN = 60
EVENTS_CNT = 462
input = Input(shape=(BLOCK_LEN,))
embedded = Embedding(input_dim=EVENTS_CNT+1, input_length=BLOCK_LEN, output_dim=200)(input)
emb_model = Model(input, embedded)
emb_model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 60) 0
_________________________________________________________________
embedding_1 (Embedding) (None, 60, 200) 92600
=================================================================
Total params: 92,600
Trainable params: 92,600
Non-trainable params: 0
_________________________________________________________________
None
Second, I created the main Seq2Seq model (using an external seq2seq library):
seq_model = Seq2Seq(batch_input_shape=(None, BLOCK_LEN, 200), hidden_dim=200, output_length=BLOCK_LEN, output_dim=EVENTS_CNT)
Resulting model:
model = Sequential()
model.add(emb_model)
model.add(seq_model)
model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
model_1 (Model) (None, 60, 200) 92600
_________________________________________________________________
model_12 (Model) (None, 60, 462) 1077124
=================================================================
Total params: 1,169,724
Trainable params: 1,169,724
Non-trainable params: 0
_________________________________________________________________
Also, I have my own accuracy metric (because the library's accuracy isn't appropriate for my data):
from keras import backend as K

def symbol_acc(y_true, y_pred):
    isEqual = K.cast(K.equal(y_true, y_pred), K.floatx())
    return K.mean(isEqual)
And compile:
model.compile(loss=tf.losses.sparse_softmax_cross_entropy, optimizer='adam', target_tensors=[tf.placeholder(tf.int32, [None, 60])], metrics=[symbol_acc])
Why the compile call looks like that: at first the model had one more layer, model.add(TimeDistributed(Dense(EVENTS_CNT, activation='softmax'))), and the compile call was model.compile(loss=custom_categorical_crossentropy, optimizer='rmsprop', metrics=[symbol_acc]). But that model produced the error "ValueError: Error when checking target: expected time_distributed_2 to have 3 dimensions, but got array with shape (2714, 60)". Now all the shapes are compatible.
But now I have a new problem (the key moment of my story): the shapes inside the symbol_acc metric are different:
Shapes (symbol_acc): (?, 60) (?, ?, 462)
So the true array has shape (?, 60) and the predicted one has shape (?, ?, 462). Each of the 60 values in true is a number from 0 to 461 (the true event number), while each of the 60 entries in predicted is a vector of size 462 holding a probability distribution over the 462 events. I want to make true the same shape as predicted: for each of the 60 values, build a vector of size 462 with a 1 at the event-number position and 0s everywhere else.
So my questions:
- How can I change the shape of an array inside the metric when I have no data before fitting the model? The best I came up with is K.gather(K.eye(462), tf.cast(number, tf.int32)): that code creates a one-hot vector with a 1 at the number position. But I don't understand how to apply it to a whole tensor without having the data (see the sketch after this list).
- Maybe there is a simpler way to solve this problem?
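For reference, the Keras backend has a built-in K.one_hot that applies exactly this transformation to an entire integer tensor at once; a minimal sketch (assuming y_true holds the integer event numbers):
import tensorflow as tf
from keras import backend as K

def to_one_hot(y_true):
    # (?, 60) integer tensor -> (?, 60, 462) one-hot tensor
    return K.one_hot(tf.cast(y_true, tf.int32), 462)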
I'm new to Keras and NNs, so I'm not sure all the steps are correct. If you see any mistake, please report it.
Answer 1:
As I tested before, using target_tensors will not work unless its shape is the same as the model's predicted shape. So this general rule cannot be violated:
Your output data must have the same shape as your model's output.
This guarantees that y_true and y_pred have the same shape.
What you need is to adapt your output data to the shape of your model, using to_categorical().
from keras.utils import to_categorical
one_hot_X = to_categorical(X_train, 462)
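To see the shape change concretely, a quick check with made-up data (the array contents are placeholders):
import numpy as np
from keras.utils import to_categorical

X_train = np.random.randint(0, 462, size=(8, 60))  # dummy batch of 8 processes
one_hot_X = to_categorical(X_train, 462)
print(one_hot_X.shape)  # (8, 60, 462) -- matches the model's output shape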
With that you simply train your model normally, without having to create workarounds in losses and accuracies:
model.fit(X_train, one_hot_X,...)
If you run into memory problems by doing this, you may consider creating a generator that will convert only part of the data for each batch:
def batch_generator(batch_size):
    while True:  # keras generators must be infinite
        # you may want to manually shuffle X_train here
        for i in range(len(X_train)//batch_size):  # make sure len is a multiple of batch_size
            x = X_train[i*batch_size:(i+1)*batch_size]
            y = to_categorical(x, 462)
            yield (x, y)
Train with:
model.fit_generator(batch_generator(size),....)
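The elided arguments would typically include steps_per_epoch, which tells fit_generator when an epoch ends; a hedged example (batch size and epoch count are arbitrary choices):
batch_size = 32  # assumed value
model.fit_generator(batch_generator(batch_size),
                    steps_per_epoch=len(X_train) // batch_size,
                    epochs=10)  # epoch count chosen arbitrarily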
Fixing your accuracy for this case
Now that we know better what you're doing, your accuracy should use K.argmax to get exact results (comparing only the single predicted class, correct or not, instead of all 462 probability values).
(My old answer was wrong, because I forgot that y_true is exact while y_pred is approximated.)
def symbol_acc(y_true, y_pred):
    y_true = K.argmax(y_true)  # this gets the class as an integer (comparable to X_train)
    y_pred = K.argmax(y_pred)  # transforming (any, 60, 462) into (any, 60)
    isEqual = K.cast(K.equal(y_true, y_pred), K.floatx())
    return K.mean(isEqual)
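With the one-hot targets from to_categorical, the model can then be compiled without target_tensors or the sparse loss; one plausible setup (the optimizer choice is arbitrary):
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=[symbol_acc])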
Just a small correction:
Embeddings don't create a "one-hot" representation; they create a multi-feature representation. (One-hot is strictly for cases where only one element in the vector is 1, while an embedding is free to take any value in any element.)
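A small sketch of the difference (the layer sizes here are arbitrary, and the embedding output is whatever the random initializer produced):
import numpy as np
from keras.layers import Input, Embedding
from keras.models import Model
from keras.utils import to_categorical

# One-hot: exactly one 1, the rest 0
print(to_categorical([3], 5))  # [[0. 0. 0. 1. 0.]]

# Embedding: a dense vector of arbitrary (learned) floats
inp = Input(shape=(1,))
emb_out = Embedding(input_dim=5, output_dim=4)(inp)
m = Model(inp, emb_out)
print(m.predict(np.array([[3]])))  # four floats, generally none of them 0 or 1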
Source: https://stackoverflow.com/questions/50339622/keras-correctness-of-model-and-issues-with-custom-metric