Question
After reading up on the subject I don't fully understand: Is the 'convolution' in neural networks comparable to a simple downsampling or 'sharpening' function?
Can you break this term down into a simple, understandable image/analogy?
edit: Rephrase after 1st answer: Can pooling be understood as downsampling of weight matrices?
Answer 1:
Convolutional neural networks are a family of models that have been shown empirically to work very well for image recognition. From this point of view, a CNN is something completely different from downsampling.
But the framework used in CNN design does contain something comparable to a downsampling technique. To understand it, you have to understand how a CNN usually works. It is built as a hierarchy of layers, and every layer has a set of trainable kernels whose output has a spatial size very similar to that of your input images.
This can be a serious problem: the output of such a layer can be extremely large (~ nr_of_kernels * size_of_kernel_output), which could make your computations intractable. This is why certain techniques are used to decrease the size of the output:
- Stride, pad and kernel size manipulation: by setting these to suitable values you can decrease the size of the output (on the other hand, you may lose some important information).
- Pooling operation: instead of passing on every output from every kernel as the layer's output, you pass only specific aggregated statistics about them (for example, the maximum or the average over a small window). It is considered extremely useful and is widely used in CNN design.
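To make the effect of stride, padding and kernel size concrete, here is a minimal sketch of the standard output-size formula for one spatial dimension. The function name `conv_output_size` and the numbers (a 224-pixel input, 64 kernels) are made up for illustration and are not taken from any particular network.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial size of a convolution output along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# Hypothetical 224-pixel-wide input and a 3x3 kernel.
same = conv_output_size(224, kernel_size=3, stride=1, padding=1)     # 224
strided = conv_output_size(224, kernel_size=3, stride=2, padding=1)  # 112
no_pad = conv_output_size(224, kernel_size=3, stride=1, padding=0)   # 222

# Rough layer output size: nr_of_kernels * size_of_kernel_output.
nr_of_kernels = 64  # assumed value
values_stride_1 = nr_of_kernels * same * same        # 3211264
values_stride_2 = nr_of_kernels * strided * strided  # 802816
```

Doubling the stride halves each spatial dimension, so the layer emits roughly a quarter as many values.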
For a detailed description you might visit this tutorial.
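Edit: Yes, pooling is a kind of downsampling 😊 Note that it downsamples the outputs of a layer (the feature maps), not the weight matrices themselves.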
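As a rough illustration of pooling as downsampling, the sketch below applies 2x2 max pooling with stride 2 to a small, made-up feature map in plain Python. The helper `max_pool_2x2` is hypothetical, and it assumes the map has even height and width.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 over a 2D list (even height and width assumed)."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


# Made-up 4x4 output of a single kernel.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 4, 6],
]

pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 5], [3, 7]]
```

Each 2x2 window is replaced by its maximum, so the 4x4 map shrinks to 2x2 while the strongest response in each region is kept.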
Source: https://stackoverflow.com/questions/38097111/convolutional-neural-networks-vs-downsampling