Implementing a perceptron with backpropagation algorithm
I am trying to implement a two-layer perceptron with backpropagation to solve the parity problem. The network has 4 binary inputs, 4 hidden units in the first layer and 1 output in the second layer. I am using this for reference, but am having problems with convergence. First, I will note that I am using a sigmoid function for activation, and so the derivative is (from what I understand) the sigmoid(v) * (1 - sigmoid(v)). So, that is used when calculating the delta value. So, basically I set up the network and run for just a few epochs (go through each possible pattern -- in this case, 16