I need to train a network to multiply or add 2 inputs, but it doesn't seem to approximate well for all points after 20000 iterations. More specifically, I train it on the whole
Think about what would happen if you replaced your tanh(x) threshold function with a linear function of x (call it a.x) and treated a as the sole learning parameter in each neuron. That's effectively what your network will be optimising towards; it's an approximation of the zero-crossing of the tanh function.
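To make the zero-crossing approximation concrete, here is a small sketch comparing tanh(x) with the line a.x. The helper name linear_error and the choice a = 1.0 (the slope of tanh at x = 0) are mine for illustration, not something taken from your network.

```python
import math

def linear_error(x, a=1.0):
    """Absolute gap between tanh(x) and the straight line a*x.

    a = 1.0 is the slope of tanh at the origin; it is an illustrative
    choice here, not a value learned by any network.
    """
    return abs(math.tanh(x) - a * x)

# Near the zero-crossing the line tracks tanh closely;
# further out, tanh saturates and the gap grows.
for x in (0.01, 0.1, 0.5, 2.0):
    print(f"x={x:<5} tanh(x)={math.tanh(x):.4f}  a*x={x:.4f}  gap={linear_error(x):.4f}")
```

The gap is tiny for small inputs and large once tanh saturates, which is why the linear picture only describes behaviour around zero.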
Now, what happens when you layer neurons of this linear type? Each neuron multiplies the output of the one before it by its own a as the signal passes from input to output. You're trying to approximate addition with a set of multiplications. That, as they say, does not compute.
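A rough sketch of that layering argument, under the simplification used above (each neuron computes only a.x, with no bias and a single input). The function name linear_chain and the slope values are hypothetical, chosen purely for illustration.

```python
import math

def linear_chain(x, slopes):
    """Pass x through a chain of neurons that each compute a * x."""
    for a in slopes:
        x = a * x
    return x

# Hypothetical per-neuron parameters, one per layer.
slopes = [0.5, 2.0, 3.0]

# The whole chain collapses to one multiplication by the product of the slopes.
combined = math.prod(slopes)

x = 4.0
print(linear_chain(x, slopes), combined * x)
```

However many layers you stack, the chain behaves like a single neuron whose a is the product of all the others, so depth adds no expressive power in this simplified model.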