Based on PyBrain's tutorials I managed to knock together the following code:
#!/usr/bin/env python2
# coding: utf-8
from pybrain.structure import FeedForwardNetwork, LinearLayer, SigmoidLayer, FullConnection
from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer
n = FeedForwardNetwork()
inLayer = LinearLayer(2)
hiddenLayer = SigmoidLayer(3)
outLayer = LinearLayer(1)
n.addInputModule(inLayer)
n.addModule(hiddenLayer)
n.addOutputModule(outLayer)
in_to_hidden = FullConnection(inLayer, hiddenLayer)
hidden_to_out = FullConnection(hiddenLayer, outLayer)
n.addConnection(in_to_hidden)
n.addConnection(hidden_to_out)
n.sortModules()
ds = SupervisedDataSet(2, 1)
ds.addSample((0, 0), (0,))
ds.addSample((0, 1), (1,))
ds.addSample((1, 0), (1,))
ds.addSample((1, 1), (0,))
trainer = BackpropTrainer(n, ds)
# trainer.train()
trainer.trainUntilConvergence()
print n.activate([0, 0])[0]
print n.activate([0, 1])[0]
print n.activate([1, 0])[0]
print n.activate([1, 1])[0]
It's supposed to learn the XOR function, but the results seem quite random:
0.208884929522
0.168926515771
0.459452834043
0.424209192223
or
0.84956138664
0.888512762786
0.564964077401
0.611111147862
There are four problems with your approach, all easy to identify after reading the Neural Network FAQ:
1. Why use a bias/threshold?: you should add a bias node. Lack of bias makes the learning very limited: the separating hyperplane represented by the network can only pass through the origin. With a bias node, it can move freely and fit the data better:

   from pybrain.structure import BiasUnit

   bias = BiasUnit()
   n.addModule(bias)
   bias_to_hidden = FullConnection(bias, hiddenLayer)
   n.addConnection(bias_to_hidden)

2. Why not code binary inputs as 0 and 1?: all your samples lie in a single quadrant of the sample space. Move them so they are scattered around the origin:
   ds = SupervisedDataSet(2, 1)
   ds.addSample((-1, -1), (0,))
   ds.addSample((-1, 1), (1,))
   ds.addSample((1, -1), (1,))
   ds.addSample((1, 1), (0,))

   (Fix the validation code at the end of your script accordingly; see the sketch after this list.)
3. The trainUntilConvergence method works using validation, and does something that resembles the early stopping method. This doesn't make sense for such a small dataset. Use trainEpochs instead. 1000 epochs is more than enough for this problem for the network to learn:

   trainer.trainEpochs(1000)

4. What learning rate should be used for backprop?: Tune the learning rate parameter. This is something you do every time you employ a neural network. In this case, a value of 0.1 or even 0.2 dramatically increases the learning speed:

   trainer = BackpropTrainer(n, dataset=ds, learningrate=0.1, verbose=True)

   (Note the verbose=True parameter. Observing how the error behaves is essential when tuning parameters.)
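As mentioned in point 2, here is a minimal sketch of what "fix the validation code accordingly" means: once the inputs are encoded as -1/1, the activation calls at the end of the script must use the same encoding (expected targets shown in the comments):

   print n.activate([-1, -1])[0]   # XOR(0, 0) -> expect a value close to 0
   print n.activate([-1, 1])[0]    # XOR(0, 1) -> expect a value close to 1
   print n.activate([1, -1])[0]    # XOR(1, 0) -> expect a value close to 1
   print n.activate([1, 1])[0]     # XOR(1, 1) -> expect a value close to 0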
With these fixes I get consistent and correct results for the given network with the given dataset, with an error of less than 1e-23.
Source: https://stackoverflow.com/questions/32655573/how-to-create-simple-3-layer-neural-network-and-teach-it-using-supervised-learni