Most of the problems I have seen were solved with one or two hidden layers. MLPs with a single hidden layer are proven to be universal function approximators (Hornik et al., 1989). Additional hidden layers can make the problem easier or harder to learn, so you usually have to try several topologies. One caveat: you cannot stack an arbitrary number of hidden layers and still train the MLP with plain backpropagation, because the gradient tends to become vanishingly small in the earliest layers (the vanishing gradient problem). That said, there are applications where people have used up to nine layers. You may also be interested in standard benchmark problems that have been solved with many different classifiers and MLP topologies.
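If you want to compare topologies empirically, something like the following sketch with scikit-learn's MLPClassifier works; the synthetic dataset and the candidate hidden-layer sizes are arbitrary choices I picked for illustration, not a recommendation:

```python
# Minimal sketch: compare a few MLP topologies by cross-validation.
# The dataset and layer sizes below are placeholders for illustration.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Candidate topologies: one and two hidden layers of varying width.
for hidden in [(10,), (50,), (10, 10), (50, 25)]:
    clf = MLPClassifier(hidden_layer_sizes=hidden, max_iter=1000,
                        random_state=0)
    score = cross_val_score(clf, X, y, cv=5).mean()
    print(f"hidden layers {hidden}: mean CV accuracy {score:.3f}")
```

In practice you would substitute your own data and could widen the search over topologies (e.g. with GridSearchCV), but even a small loop like this shows how much the result can depend on the chosen architecture.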