There are times when a purely linear network can give useful results. Say we have a network of three layers with shapes (3,2,3), trained to reproduce its input. Because the middle layer has only two dimensions, the output is confined to a plane, and minimizing the squared error makes that plane the "plane of best fit" in the original three-dimensional space.
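There are easier ways to find linear transformations of this form, such as PCA or NMF. Still, this is a case where a multi-layer linear network does NOT behave the same way as a single-layer perceptron: the two-unit middle layer limits the combined map to rank 2, whereas a single 3×3 layer can have full rank.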
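To make the rank argument concrete, here is a minimal NumPy sketch. It does not train a network; it assumes that a (3,2,3) linear network trained with squared error ends up spanning the same plane as the top two principal components, and so builds the two weight matrices directly from an SVD. The synthetic data and the names `W_enc`, `W_dec` and `W_single` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): 200 points near a plane in 3-D.
basis = rng.normal(size=(2, 3))          # two directions spanning the plane
coords = rng.normal(size=(200, 2))       # coordinates within the plane
X = coords @ basis + 0.01 * rng.normal(size=(200, 3))
X -= X.mean(axis=0)                      # centre the data, as PCA does

# PCA via SVD: the top two right singular vectors span the plane of best fit.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
W_enc = Vt[:2].T    # shape (3, 2): 3-D input -> 2-D middle layer
W_dec = Vt[:2]      # shape (2, 3): 2-D middle layer -> 3-D output

# The (3,2,3) linear network collapses to a single 3x3 matrix of rank <= 2.
W_bottleneck = W_enc @ W_dec
rank_bottleneck = np.linalg.matrix_rank(W_bottleneck)

# A single unconstrained 3x3 layer can simply be the identity, which has rank 3.
W_single = np.eye(3)
rank_single = np.linalg.matrix_rank(W_single)

# Mean squared reconstruction error of each map on the data.
err_bottleneck = np.mean((X @ W_bottleneck - X) ** 2)
err_single = np.mean((X @ W_single - X) ** 2)

print("rank with bottleneck:", rank_bottleneck, "| single layer:", rank_single)
print("error with bottleneck:", err_bottleneck, "| single layer:", err_single)
```

The single layer reproduces the input exactly because nothing stops it from learning the identity, while the bottleneck version can only project onto a plane and leaves the off-plane noise as residual error. That projection is the useful part: it is what a single unconstrained layer would never be forced to find.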