I came across a paper on converting photo sketches into realistic images of people, in which the authors employ multi-scale CNNs to extract detailed features in their encoder-decoder ar