Question
I am converting some code from TensorFlow to tf.keras. I used to upsample with tf.nn.conv2d_transpose, which takes an output_shape parameter, and everything worked fine. After switching to Keras models I started using the Conv2DTranspose layer, but I can't get the desired output shape.
For the sake of simplicity, let's assume my input shape is (38,25). I apply a max-pooling operation with pool_size=(2,2), strides=(2,2) and padding='same', which outputs a (19,13) volume. Then the 'deconvolution' is applied with kernel_size=(4,4), padding='same' and strides=(2,2), which returns a (38,26) volume.
I tried every combination of padding and output_padding, and none of them produces a (38,25) volume, which is what I got from tf.nn.conv2d_transpose by passing an explicit output_shape.
This sample code shows what I mean:
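The arithmetic behind those shapes can be sketched in plain Python. This is only a minimal sketch, assuming that 'same' pooling yields ceil(n / stride) and that a 'same' transposed convolution without output_padding yields n * stride. The helper names are hypothetical and not part of TensorFlow.

```python
import math

def same_pool_length(n, stride):
    """Length after 'same'-padded pooling (assumed: ceil(n / stride))."""
    return math.ceil(n / stride)

def same_upsample_length(n, stride):
    """Length after a 'same' transposed conv with no output_padding (assumed: n * stride)."""
    return n * stride

pooled = tuple(same_pool_length(n, 2) for n in (38, 25))
upsampled = tuple(same_upsample_length(n, 2) for n in pooled)
print(pooled)     # (19, 13)
print(upsampled)  # (38, 26)

# Widths 25 and 26 both pool down to 13, so the pooled width alone
# does not say which of the two was the original size.
candidates = [n for n in range(20, 30) if same_pool_length(n, 2) == 13]
print(candidates)  # [25, 26]
```

Under these assumptions the odd width is lost at the pooling step, which may be why an explicit output_shape is needed to recover it.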
import tensorflow as tf
import numpy as np
from tensorflow.keras.layers import Conv2DTranspose, Input, MaxPool2D, ZeroPadding2D
from tensorflow.keras import Model

I = Input(shape=(38, 25, 3))
X = MaxPool2D(pool_size=(2, 2), strides=(2, 2), padding='same')(I)
Y = []
# Collect every output obtainable with different padding and output_padding values
for padding in ['same', 'valid']:
    for i in range(2):  # output_padding must be in [0, ..., stride-1]
        for j in range(2):
            Y.append(Conv2DTranspose(3, (4, 4), strides=(2, 2), padding=padding,
                                     name=padding + str(i) + str(j), use_bias=False,
                                     output_padding=(i, j))(X))
    # Same layer, but letting Keras infer the output shape (no output_padding)
    Y.append(Conv2DTranspose(3, (4, 4), strides=(2, 2), padding=padding,
                             name=padding + '_no_output_padding', use_bias=False)(X))
model = Model(inputs=I, outputs=Y, name='model')
model.summary()
Which gives the following output:
input_1 (InputLayer) (None, 38, 25, 3) 0
______________________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 19, 13, 3) 0 input_1[0][0]
______________________________________________________________________________
same00 (Conv2DTranspose) (None, 36, 24, 3) 144 max_pooling2d[0][0]
______________________________________________________________________________
same01 (Conv2DTranspose) (None, 36, 25, 3) 144 max_pooling2d[0][0]
______________________________________________________________________________
same10 (Conv2DTranspose) (None, 37, 24, 3) 144 max_pooling2d[0][0]
______________________________________________________________________________
same11 (Conv2DTranspose) (None, 37, 25, 3) 144 max_pooling2d[0][0]
______________________________________________________________________________
same_no_output_padding (Conv2DT (None, 38, 26, 3) 144 max_pooling2d[0][0]
______________________________________________________________________________
valid00 (Conv2DTranspose) (None, 40, 28, 3) 144 max_pooling2d[0][0]
______________________________________________________________________________
valid01 (Conv2DTranspose) (None, 40, 29, 3) 144 max_pooling2d[0][0]
______________________________________________________________________________
valid10 (Conv2DTranspose) (None, 41, 28, 3) 144 max_pooling2d[0][0]
______________________________________________________________________________
valid11 (Conv2DTranspose) (None, 41, 29, 3) 144 max_pooling2d[0][0]
______________________________________________________________________________
valid_no_output_padding (Conv2D (None, 40, 28, 3) 144 max_pooling2d[0][0]
Each layer name has the form <padding><i><j>, meaning that the layer uses that padding type with output_padding=(i,j). For example, same01 means padding='same' and output_padding=(0,1). Note that no combination outputs the desired shape (38,25,3).
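The shapes in the summary can be reproduced without building a model. The sketch below assumes the output-length rule that Keras appears to apply to transposed convolutions (ignoring dilation); deconv_length is a hypothetical helper written for this illustration, not a Keras function.

```python
from itertools import product

def deconv_length(n, kernel, stride, padding, output_padding=None):
    """Assumed Conv2DTranspose output length along one axis (no dilation)."""
    if output_padding is None:
        if padding == 'valid':
            return n * stride + max(kernel - stride, 0)
        return n * stride  # 'same'
    pad = kernel // 2 if padding == 'same' else 0
    return (n - 1) * stride + kernel - 2 * pad + output_padding

shapes = {}
for padding in ('same', 'valid'):
    for i, j in product(range(2), repeat=2):
        shapes[f'{padding}{i}{j}'] = (
            deconv_length(19, 4, 2, padding, i),
            deconv_length(13, 4, 2, padding, j),
        )
    shapes[f'{padding}_no_output_padding'] = (
        deconv_length(19, 4, 2, padding),
        deconv_length(13, 4, 2, padding),
    )

for name, shape in shapes.items():
    print(name, shape)

# The desired (38, 25) never shows up among the reachable shapes.
print((38, 25) in shapes.values())  # False
```

If this rule is right, output_padding only adds to a smaller base length (36 for padding='same'), so output_padding values below the stride cannot reach 38 in height while keeping 25 in width.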
Meanwhile using tf.nn.conv2d_transpose:
x = np.arange(19*13*3).reshape([1,19,13,3]).astype('float32')
w = np.random.normal(size=(4,4,3,3)).astype('float32')
tf_op = tf.nn.conv2d_transpose(x,w,[1,38,25,3], strides=[1,2,2,1], padding='SAME')
with tf.Session() as sess:
    print(sess.run(tf_op).shape)
Output: (1, 38, 25, 3)
Maybe tf.nn.conv2d_transpose applies some step that tf.keras doesn't.
I hope someone can help me. Thanks for reading.
Source: https://stackoverflow.com/questions/57640507/how-to-get-properly-the-outputs-shape-when-converting-code-from-tf-nn-conv2d-tr