I'm building a model that converts a string to another string using recurrent layers (GRUs). I have tried both a Dense and a TimeDistributed(Dense) layer as the last-but-one layer.
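For context, the two variants I am comparing look roughly like this (a sketch with made-up layer sizes, not my actual model):

from keras.models import Sequential
from keras.layers import GRU, Dense, TimeDistributed

# Variant 1: plain Dense after a GRU that returns the full sequence
model1 = Sequential([
    GRU(32, return_sequences=True, input_shape=(4, 3)),
    Dense(5),
])

# Variant 2: the same GRU followed by TimeDistributed(Dense)
model2 = Sequential([
    GRU(32, return_sequences=True, input_shape=(4, 3)),
    TimeDistributed(Dense(5)),
])

As far as I can tell, model1.summary() and model2.summary() report the same output shapes and parameter counts.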
Here is a piece of code that verifies that TimeDistributed(Dense(X)) is identical to Dense(X):
import numpy as np
import tensorflow as tf
from keras.layers import Dense, TimeDistributed

X = np.array([[[1, 2, 3],
               [4, 5, 6],
               [7, 8, 9],
               [10, 11, 12]],
              [[3, 1, 7],
               [8, 2, 5],
               [11, 10, 4],
               [9, 6, 12]]]).astype(np.float32)
print(X.shape)
# (2, 4, 3)

dense_weights = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
                          [0.2, 0.7, 0.9, 0.1, 0.2],
                          [0.1, 0.8, 0.6, 0.2, 0.4]])
bias = np.array([0.1, 0.3, 0.7, 0.8, 0.4])
print(dense_weights.shape)
# (3, 5)

dense = Dense(input_dim=3, units=5, weights=[dense_weights, bias])
input_tensor = tf.Variable(X, name='inputX')
output_tensor1 = dense(input_tensor)
output_tensor2 = TimeDistributed(dense)(input_tensor)
print(output_tensor1.shape)
print(output_tensor2.shape)
# (2, 4, 5)
# (2, ?, 5)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    output1 = sess.run(output_tensor1)
    output2 = sess.run(output_tensor2)
    print(output1 - output2)
And the difference is:
[[[0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]]

 [[0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]]]
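This is what I would expect from doing the matrix product by hand: in Keras 2, a Dense layer applied to a 3D input multiplies the last axis by its kernel, which is exactly what TimeDistributed(Dense) does one timestep at a time. A minimal NumPy check, reusing X, dense_weights and bias from the snippet above:

import numpy as np

# Dense on a (batch, time, features) input contracts the last axis:
# (2, 4, 3) x (3, 5) -> (2, 4, 5), plus the broadcast bias.
manual = np.einsum('btf,fo->bto', X, dense_weights) + bias
print(manual.shape)
# (2, 4, 5)
print(np.allclose(manual, output1), np.allclose(manual, output2))
# expected: True True

So the only visible difference here seems to be the statically known time dimension in the reported shapes, (2, 4, 5) versus (2, ?, 5).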