Is a neural network with 2 input nodes, 2 hidden nodes and 1 output node supposed to be able to solve the XOR problem if there are no biases? Or can it get stuck?
Yes, it can, if you use an activation function like ReLU (f(x) = max(0, x)).
An example of weights for such a network:
Layer1: [[-1, 1], [1, -1]]
Layer2: [[1], [1]]
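For the first (hidden) layer: with inputs (x1, x2), the two hidden units compute ReLU(x2 - x1) and ReLU(x1 - x2). When the inputs are equal, both units output 0. When they differ, exactly one unit outputs 1 and the other outputs 0.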
For the second (output) layer: since the weights are [[1], [1]] (and ReLU guarantees there are no negative activations from the previous layer), the layer simply sums the activations of layer 1. The output is therefore 1 when the two inputs differ and 0 when they are equal, which is XOR.
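A quick way to check these weights is to run the forward pass over all four inputs. The sketch below assumes NumPy, a row-vector convention (inputs multiplied on the left of each weight matrix) and a linear output unit. The names `relu`, `forward`, `W1` and `W2` are only illustrative.

```python
import numpy as np

def relu(x):
    # Element-wise max(0, x)
    return np.maximum(0, x)

# Weights from the example above; no bias terms anywhere
W1 = np.array([[-1, 1], [1, -1]])  # input -> hidden
W2 = np.array([[1], [1]])          # hidden -> output

def forward(X):
    hidden = relu(X @ W1)  # hidden units: relu(x2 - x1), relu(x1 - x2)
    return hidden @ W2     # output: sum of the two hidden activations

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
outputs = forward(X).ravel()
print(outputs)  # [0 1 1 0]
```

The printed values match the XOR truth table for the inputs in the order (0,0), (0,1), (1,0), (1,1).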
This works only because the False examples of the XOR problem are labelled zero (0): with ReLU and no biases, the input (0, 0) always produces an output of 0, whatever the weights are. If, for example, we used ones for False examples and twos for True examples, this approach would no longer work.
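This limitation can be checked numerically: without biases, the input (0, 0) maps to 0 for any choice of weights in a ReLU network of this shape. A small sketch, assuming NumPy and randomly drawn weights (the seed and the number of trials are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
zero_input = np.zeros((1, 2))  # the (0, 0) input as a row vector

outputs_at_origin = []
for _ in range(5):
    # Random weights for a 2-2-1 network with no bias terms
    W1 = rng.normal(size=(2, 2))
    W2 = rng.normal(size=(2, 1))
    hidden = np.maximum(0, zero_input @ W1)  # ReLU of all zeros is all zeros
    outputs_at_origin.append((hidden @ W2).item())

# Every entry is zero, regardless of the weights drawn
print(all(v == 0 for v in outputs_at_origin))
```

So a labelling in which (0, 0) must map to 1 cannot be fit by this architecture unless a bias term is added.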