Problems while using ScikitLearn's Neural Network implementation

问题

I am trying to implement image processing using Neural Network implementation given by Scikit Learn. I have close to 10,000 color images in 'JPG' format, I converted those images into 'PNG' format and removed the color information. The new images are all Black OR White images. After converting these images into vector format, these image vectors formed the input to the Neural Network.

To each image, there is an output as well which forms the output of the Neural Network.

The input file only has values of 0's and 1's and nothing else at all. The output for each image corresponds to a vector which is continuous, between 0 and 1 and is 22 in length. i.e. each image's output is a vector with length 22.

To start off with the processing, I began with only 100 images and their corresponding outputs and got the following error:

ValueError: Array contains NaN or infinity

I would also like to point out that the first iteration was completed here and I encountered this error during the second iteration.

To try something different, I trimmed my input and output down to 10 images each. Using the same piece of code (coming up shortly), I was able to complete 7 iterations (I had set the number of iterations to 20) and then received the same error.

I then changed the number of iterations to 5, just to check if it works. After this change, I got the following error:

ValueError: bad input shape (10, 22)

I also tried to use np.reval() on my input and output but that gave me NaN or Infinity error again.

Here is the code I am using for the whole process:

import numpy as np
import csv
import matplotlib.pyplot as plt
from scipy.ndimage import convolve
from sklearn import linear_model, datasets, metrics
from sklearn.cross_validation import train_test_split
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline


def ReadCsv(fileName):
    in_file = open(fileName, 'rUb')
    reader = csv.reader(in_file, delimiter=',', quotechar='"')
    data = [[]]
    for row in reader:
        data.append(row)

    data.pop(0)
    return data

X_train = np.asarray(ReadCsv('100Images.csv'), 'float32')
Y_train = np.asarray(ReadCsv('100Images_Y_new.csv'), 'float32')
X_test = np.asarray(ReadCsv('ImagesForTest.csv'), 'float32')
Y_test = np.asarray(ReadCsv('ImagesForTest_Y_new.csv'), 'float32')

logistic = linear_model.LogisticRegression()
rbm = BernoulliRBM(random_state=0, verbose=True)

classifier = Pipeline(steps=[('rbm', rbm), ('logistic', logistic)])

rbm.learning_rate = 0.06
rbm.n_iter = 5

rbm.n_components = 100
logistic.C = 6000.0

classifier.fit(X_train, Y_train)

print()
print("Logistic regression using RBM features:\n%s\n" % (
    metrics.classification_report(
        Y_test,
        classifier.predict(X_test))))

I would really appreciate any kind of help on this issue.

TIA.