How a Convolutional Neural Net handles channels

Submitted by 空扰寡人 on 2019-12-24 10:37:09

Question


I've looked through a lot of explanations of the way a CNN conventionally handles multiple channels (such as the 3 in an RGB image) and am still at a loss.

When a 5x5x3 filter (say) is applied to a patch of an RGB image what exactly happens? Is it in fact 3 different 2D convolutions (with independent weights) that happen separately to each channel? And then the results get simply added together to produce the final output to pass to the next layer? Or a truly 3D convolution?


Answer 1:


This example is from Andrew Ng's deeplearning.ai course. The input image is 6 x 6 x 3, where 6 x 6 is the height and width of the image and 3 corresponds to the 3 color channels. For the convolution step we convolve the input image with a 3 x 3 x 3 filter (kernel). The input image and the filter both have 3 layers; in a standard convolution the filter has the same number of channels as its input. The output will be 4 x 4 x 1.

The 3 x 3 x 3 filter gives you 27 parameters, which you multiply element-wise with the corresponding values in the red, green and blue channels of the 3 x 3 x 3 patch of the image under the filter. Then add up all 27 products to get the value at [0,0] in the 4 x 4 output. Now slide the filter window (the yellow cube in the course illustration) one box to the right and repeat; once it reaches the right edge, move it one row down, return to the left edge, and continue until the 4 x 4 output is filled. I would suggest taking paper and pencil, filling in random values for all the cells of the input and the kernel, and working through the multiplication by hand.
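The same procedure can be checked numerically. The following is a minimal NumPy sketch, not code from the course: the function names `conv_multichannel` and `conv_per_channel_then_sum` and the random test values are assumptions made for illustration. It slides a 3 x 3 x 3 filter over a 6 x 6 x 3 input with stride 1 and no padding, and then compares the result with running a separate 2D pass on each channel (each with its own 3 x 3 slice of weights) and adding the three results, which is the reading proposed in the question.

```python
import numpy as np


def conv_multichannel(image, kernel):
    """Slide a kh x kw x C kernel over an H x W x C image (stride 1, no padding).

    As in most CNN libraries, the kernel is not flipped, so strictly speaking
    this is a cross-correlation.
    """
    H, W, C = image.shape
    kh, kw, kc = kernel.shape
    assert C == kc, "filter depth must match the number of input channels"
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # kh x kw x C patch under the filter: multiply element-wise, then sum
            patch = image[i:i + kh, j:j + kw, :]
            out[i, j] = np.sum(patch * kernel)
    return out


def conv_per_channel_then_sum(image, kernel):
    """Run one 2D pass per channel with that channel's weights, then add the maps."""
    channels = range(image.shape[2])
    return sum(
        conv_multichannel(image[:, :, c:c + 1], kernel[:, :, c:c + 1])
        for c in channels
    )


rng = np.random.default_rng(0)           # arbitrary seed, illustration only
image = rng.standard_normal((6, 6, 3))   # 6 x 6 RGB input
kernel = rng.standard_normal((3, 3, 3))  # one 3 x 3 x 3 filter, 27 weights

full = conv_multichannel(image, kernel)
split = conv_per_channel_then_sum(image, kernel)

print(full.shape)                 # (4, 4)
print(np.allclose(full, split))   # True
```

Under these assumptions the two computations agree, so for a single filter the "3D" operation and "three independent 2D convolutions added together" describe the same arithmetic. The filter only slides along height and width, never along the channel axis, which is why the output has depth 1.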

For more details, watch these lectures on YouTube: https://www.youtube.com/watch?v=KTB_OFoAQcc&index=6&list=PLkDaE6sCZn6Gl29AoE31iwdVwSG-KnDzF

https://www.youtube.com/watch?v=7g8jpK4llkc&t=1s



Source: https://stackoverflow.com/questions/47982594/how-a-convolutional-neural-net-handles-channels
