CNN: input stride vs. output stride

In the paper 'Fully Convolutional Networks for Semantic Segmentation' the authors distinguish between input stride and output stride in the context of deconvolution. How do these terms differ from each other?

Input stride is the stride of the filter: how far you shift the filter across its input between successive applications.
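As a rough illustration of the filter stride described above, here is a minimal 1-D sketch with no padding. The helper names `filter_positions` and `conv_output_size` are made up for this example and do not come from the paper or from any library.

```python
def filter_positions(input_size, kernel_size, stride):
    """Start indices at which a 1-D filter is applied (no padding)."""
    return list(range(0, input_size - kernel_size + 1, stride))


def conv_output_size(input_size, kernel_size, stride):
    """Number of filter applications, i.e. the output length (no padding)."""
    return (input_size - kernel_size) // stride + 1


# A size-3 filter over a length-7 input.
print(filter_positions(7, 3, 1))  # [0, 1, 2, 3, 4]
print(filter_positions(7, 3, 2))  # [0, 2, 4]
print(conv_output_size(7, 3, 2))  # 3
```

A larger stride means the filter visits fewer positions, so the output gets smaller.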

Output stride is actually a nominal value. In a CNN we get a feature map after several convolution and max-pooling operations. Let's say our input image is 224 × 224 and our final feature map is 7 × 7.

Then we say our output stride is 224 / 7 = 32 (roughly the factor by which the image has been downsampled).
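The arithmetic above can be sketched in a few lines. The list `layer_strides` is a hypothetical example, assuming a network with five stride-2 stages; it is not taken from any specific architecture definition.

```python
from math import prod


def output_stride(input_size, feature_map_size):
    """Ratio of input resolution to final feature-map resolution."""
    return input_size // feature_map_size


# Hypothetical network: five downsampling stages, each with stride 2.
layer_strides = [2, 2, 2, 2, 2]

print(output_stride(224, 7))  # 32
print(prod(layer_strides))    # 32
```

Under that assumption, the nominal output stride is simply the product of the strides of all downsampling layers.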

This TensorFlow script describes what output stride is and how to use it in an FCN, which is the dense-prediction case:

one uses inputs with spatial dimensions that are multiples of 32 plus 1, e.g., [321, 321]. In this case the feature maps at the ResNet output will have spatial shape [(height - 1) / output_stride + 1, (width - 1) / output_stride + 1] and corners exactly aligned with the input image corners, which greatly facilitates alignment of the features to the image. Using as input [225, 225] images results in [8, 8] feature maps at the output of the last ResNet block.
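The shape formula in the quoted comment can be checked with a short sketch. The function name `aligned_feature_size` is made up for this example, and integer division is assumed.

```python
def aligned_feature_size(size, output_stride=32):
    """Spatial size given by (size - 1) / output_stride + 1, with integer division."""
    return (size - 1) // output_stride + 1


print(aligned_feature_size(225))  # 8
print(aligned_feature_size(321))  # 11
```

The first value matches the [8, 8] feature map the quote gives for a [225, 225] input.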

Source: https://stackoverflow.com/questions/44586894/cnn-input-stride-vs-output-stride

Tags: conv-neural-network, convolution, deconvolution
