A feed-forward neural network with linear activations and any number of hidden layers is equivalent to a linear neural network with no hidden layer. For example, let's consider the neural network in the figure with two hidden layers and no activation:
y = h2 * W3 + b3
= (h1 * W2 + b2) * W3 + b3
= h1 * W2 * W3 + b2 * W3 + b3
= (x * W1 + b1) * W2 * W3 + b2 * W3 + b3
= x * W1 * W2 * W3 + b1 * W2 * W3 + b2 * W3 + b3
= x * W' + b'
We can do the last step because a composition of several linear transformations can be replaced by a single transformation, and a combination of several bias terms is just a single bias. The outcome is the same even if we add a linear activation. The sketch below checks this collapse numerically.
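Here is a minimal NumPy sketch of that check; the layer sizes and random weights are arbitrary choices for illustration, not taken from the figure:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))                       # batch of 5 inputs, 4 features each
W1, b1 = rng.normal(size=(4, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 6)), rng.normal(size=6)
W3, b3 = rng.normal(size=(6, 3)), rng.normal(size=3)

# Forward pass through the two-hidden-layer network with no (identity) activation.
h1 = x @ W1 + b1
h2 = h1 @ W2 + b2
y = h2 @ W3 + b3

# Collapsed single-layer equivalent: W' = W1 W2 W3 and b' = b1 W2 W3 + b2 W3 + b3.
W_prime = W1 @ W2 @ W3
b_prime = b1 @ W2 @ W3 + b2 @ W3 + b3

assert np.allclose(y, x @ W_prime + b_prime)      # the deep linear net equals one linear layer
```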
So we could replace this neural net with a single-layer neural net. The same argument extends to n layers, which indicates that adding layers doesn't increase the approximation power of a linear neural net at all. We need non-linear activation functions to approximate non-linear functions, and most real-world problems are highly complex and non-linear. In fact, when the activation function is non-linear, a two-layer neural network with a sufficiently large number of hidden units can be proven to be a universal function approximator.
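As a complementary sketch (again with arbitrary shapes and random weights), inserting a ReLU between the layers breaks the collapse: the network is no longer a linear function of its input, which is what gives depth its extra expressive power.

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 3)), rng.normal(size=3)

# Two-layer network with a ReLU hidden layer.
f = lambda v: relu(v @ W1 + b1) @ W2 + b2

# Any affine map g satisfies g(a + b) = g(a) + g(b) - g(0); the ReLU network
# violates this for random inputs (with high probability), so it is non-linear.
a, b = rng.normal(size=4), rng.normal(size=4)
print(np.allclose(f(a + b), f(a) + f(b) - f(np.zeros(4))))   # prints False
```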