Keras' fit_generator() for binary classification predictions always 50%

Question

I have set up a model to classify whether an image shows a certain video game or not. I pre-scaled my images to 250x250 pixels and separated them into two folders (the two binary classes) labelled 0 and 1. The class sizes are within ~100 images of each other, and I have around 3500 images in total.

Here are photos of the training process, the model set up and some predictions: https://imgur.com/a/CN1b6LV

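# Imports assumed for the standalone Keras API used below
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.layers import (BatchNormalization, Conv2D, Dense, Flatten,
                          GlobalAveragePooling2D, MaxPooling2D)
from keras.models import Sequential
from keras.optimizers import Adam
from keras.preprocessing.image import ImageDataGenerator

# batchsize is defined elsewhere in the original script; its value is not shown

train_datagen = ImageDataGenerator(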
    rescale=1. / 255,
    shear_range=0,
    zoom_range=0,
    horizontal_flip=True,
    width_shift_range=0.1,
    height_shift_range=0.1,
    validation_split=0.2)
train_generator = train_datagen.flow_from_directory(
    'data\\',
    batch_size=batchsize,
    shuffle=True,
    target_size=(250, 250),
    subset="training",
    class_mode="binary")
val_generator = train_datagen.flow_from_directory(
    'data\\',
    batch_size=batchsize,
    shuffle=True,
    target_size=(250, 250),
    subset="validation",
    class_mode="binary")
pred_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0,
    zoom_range=0,
    horizontal_flip=False,
    width_shift_range=0.1,
    height_shift_range=0.1)
pred_generator = pred_datagen.flow_from_directory(
    'batch_pred\\',
    batch_size=30,
    shuffle=False,
    target_size=(250, 250))


model = Sequential()
model.add(Conv2D(input_shape=(250, 250, 3), filters=25, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=32, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=32, kernel_size=3, activation="relu", padding="same"))
model.add(MaxPooling2D(pool_size=2,  padding="same", strides=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(filters=64, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=64, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=64, kernel_size=3, activation="relu", padding="same"))
model.add(MaxPooling2D(pool_size=2, padding="same", strides=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(filters=128, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=128, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=128, kernel_size=3, activation="relu", padding="same"))
model.add(MaxPooling2D(pool_size=2, padding="same", strides=(2, 2)))
model.add(Conv2D(filters=256, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=256, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=256, kernel_size=3, activation="relu", padding="same"))
model.add(MaxPooling2D(pool_size=2, padding="same", strides=(2, 2)))
model.add(BatchNormalization())
dense = False
if dense:
    model.add(Flatten())
    model.add(Dense(250, activation="relu"))
    model.add(BatchNormalization())
    model.add(Dense(50, activation="relu"))
else:
    model.add(GlobalAveragePooling2D())
model.add(Dense(1, activation="softmax"))
model.compile(loss='binary_crossentropy',
              optimizer=Adam(0.0005), metrics=["acc"])
callbacks = [EarlyStopping(monitor='val_acc', patience=200, verbose=1),
             ModelCheckpoint(filepath="model_checkpoint.h5py",
                             monitor='val_acc', save_best_only=True, verbose=1)]
model.fit_generator(
      train_generator,
      steps_per_epoch=train_generator.samples // batchsize,
      validation_data=val_generator,
      validation_steps=val_generator.samples // batchsize,
      epochs=500,
      callbacks=callbacks)

Everything appears to run correctly: the model iterates over the data each epoch, the generators find the correct number of images, and so on. However, my predictions are always 50%, despite good validation accuracy, low loss and high training accuracy.

I'm not sure what I'm doing wrong and any help would be appreciated.

Answer 1:

I think your problem is that you're using sigmoid for binary classification, your final layer activation function should be linear.

Answer 2:

The problem is that you are using softmax on a Dense layer with one unit. The softmax function normalizes its input so that its elements sum to one, so with a single unit the output is always 1, whatever the input. For binary classification, use the sigmoid function as the activation of the last layer instead.
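
To make that concrete, here is a minimal NumPy sketch (not the asker's Keras code) that applies both activations to the output of a single-unit layer. The `softmax` and `sigmoid` helpers and the `logits` values are made up for illustration; they stand in for whatever `Dense(1)` would compute before its activation.

```python
import numpy as np


def softmax(logits):
    """Softmax over the last axis, i.e. across the units of a layer."""
    # Subtract the per-row maximum for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def sigmoid(logits):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-logits))


# Hypothetical pre-activation outputs of a one-unit layer for four images.
logits = np.array([[-3.0], [-0.5], [0.0], [4.0]])

softmax_out = softmax(logits)  # each row has one element, so each row is 1.0
sigmoid_out = sigmoid(logits)  # varies with the input, stays within (0, 1)

print("softmax:", softmax_out.ravel())
print("sigmoid:", sigmoid_out.ravel())
```

With one unit there is nothing to normalize against, so softmax carries no information about the input, while sigmoid gives a usable probability for class 1. In the model above, the corresponding change would be `model.add(Dense(1, activation="sigmoid"))`, keeping `binary_crossentropy` as the loss.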

Source: https://stackoverflow.com/questions/53311885/keras-fit-generator-for-binary-classification-predictions-always-50

Tags

python

tensorflow

machine-learning

keras

classification