Getting the output shape of a deconvolution layer using tf.nn.conv2d_transpose in TensorFlow

终归单人心 2020-12-15 01:39

According to this paper, the output shape is N + H - 1, N is input height or width, H is kernel height or width. This is obvious inver

3 answers

一向 (original poster)

2020-12-15 02:38
This discussion is really helpful; I would just like to add some information. With padding='SAME', the one additional padded pixel can also go to the bottom and right side. According to the TensorFlow documentation, the test case below
```
strides = [1, 2, 2, 1]
# Input, output: [batch, height, width, depth]
x_shape = [2, 6, 4, 3]
y_shape = [2, 12, 8, 2]

# Filter: [kernel_height, kernel_width, output_depth, input_depth]
f_shape = [3, 3, 2, 3]
```
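uses padding='SAME'. Treating conv2d_transpose as the inverse of conv2d (so y is the conv2d input of size W and x is its output), we can interpret padding='SAME' as: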
```
(W−F+pad_along_height)/S+1 = out_height,
(W−F+pad_along_width)/S+1 = out_width.
```
So (12 - 3 + pad_along_height) / 2 + 1 = 6, which gives pad_along_height = 1. Then pad_top = pad_along_height / 2 = 1 / 2 = 0 (integer division) and pad_bottom = pad_along_height - pad_top = 1.
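The arithmetic above can be checked with a small sketch. The helper name `same_padding` is hypothetical (it is not a TensorFlow function); it only assumes the padding='SAME' rule for conv2d, where the output size is ceil(in / stride) and any odd padding pixel goes to the bottom or right.

```python
def same_padding(in_size, filter_size, stride):
    """Return (out_size, pad_before, pad_after) along one spatial dimension
    for a forward conv2d with padding='SAME' (hypothetical helper)."""
    out_size = -(-in_size // stride)  # integer ceil(in_size / stride)
    pad_total = max((out_size - 1) * stride + filter_size - in_size, 0)
    pad_before = pad_total // 2            # top or left
    pad_after = pad_total - pad_before     # bottom or right gets the extra pixel
    return out_size, pad_before, pad_after


# Height and width of y_shape = [2, 12, 8, 2] with a 3x3 filter and stride 2.
print(same_padding(12, 3, 2))  # (6, 0, 1)
print(same_padding(8, 3, 2))   # (4, 0, 1)
```

Both dimensions map back to x_shape = [2, 6, 4, 3], with a single padded pixel on the bottom and right.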

As for padding='VALID', as the name suggests, the filter is only applied at valid positions, that is, where it fits entirely inside the original input image, so no zero padding is added outside the image region. For example, in the test case below,
```
strides = [1, 2, 2, 1]

# Input, output: [batch, height, width, depth]
x_shape = [2, 6, 4, 3]
y_shape = [2, 13, 9, 2]

# Filter: [kernel_height, kernel_width, output_depth, input_depth]
f_shape = [3, 3, 2, 3]
```
The output shape of conv2d is
```
out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
           = ceil(float(13 - 3 + 1) / float(2)) = ceil(11/2) = 6
           = (W−F)/S + 1.
```
Because (W−F)/S+1 = (13-3)/2+1 = 6 is an integer, we don't need to add any zero pixels around the border of the image, so pad_top and pad_left, as defined in the padding section of the TensorFlow documentation, are both 0 for padding='VALID'.
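As a rough cross-check of both test cases, the sketch below lists every size y that the forward conv2d formula maps back to a given x. The function names are hypothetical, and the sketch only applies the documented conv2d output-size formulas; it does not call TensorFlow, so whether conv2d_transpose accepts every listed size is an assumption left unverified here.

```python
import math


def conv_out_size(in_size, filter_size, stride, padding):
    """Forward conv2d output size along one dimension (hypothetical helper)."""
    if padding == "SAME":
        return math.ceil(in_size / stride)
    if padding == "VALID":
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


def transpose_out_candidates(x_size, filter_size, stride, padding):
    """All sizes y for which the forward conv2d maps y back to x_size."""
    upper = x_size * stride + filter_size  # generous search bound
    return [y for y in range(1, upper + 1)
            if conv_out_size(y, filter_size, stride, padding) == x_size]


# x_shape = [2, 6, 4, 3], 3x3 filter, stride 2.
print(transpose_out_candidates(6, 3, 2, "SAME"))   # [11, 12]
print(transpose_out_candidates(4, 3, 2, "SAME"))   # [7, 8]
print(transpose_out_candidates(6, 3, 2, "VALID"))  # [13, 14]
print(transpose_out_candidates(4, 3, 2, "VALID"))  # [9, 10]
```

The shapes used in the two test cases, [2, 12, 8, 2] for 'SAME' and [2, 13, 9, 2] for 'VALID', both fall inside these candidate sets. With a stride greater than 1 the output size is not unique, which is why conv2d_transpose asks for an explicit output_shape.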