Trained TensorFlow model always outputs zero

Submitted by 和自甴很熟 on 2019-12-24 10:58:10

Question


I am training an autonomous driving convolutional neural network in TensorFlow. It is a simple regression network that takes an image and outputs a single value (a steering angle).

This is the function in which the network is defined:

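# Imports assumed from the TensorFlow 1.x tf.contrib.learn tutorial;
# the original post does not show them.
import tensorflow as tf
from tensorflow.contrib import learn
from tensorflow.contrib.learn.python.learn.estimators import model_fn as model_fn_lib

def cnn_model_fn(features, labels, mode):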
    conv1 = tf.layers.conv2d(
        inputs=features,
        filters=32,
        kernel_size=5,
        padding="same",
        activation=tf.nn.relu
    )

    pool1 = tf.layers.max_pooling2d(
        inputs=conv1,
        pool_size=2,
        strides=2
    )

    pool1_flat = tf.reshape(pool1, [-1, 2764800])

    dense1 = tf.layers.dense(
        inputs=pool1_flat,
        units=128,
        activation=tf.nn.relu
    )

    dropout = tf.layers.dropout(
        inputs=dense1,
        rate=0.4,
        training=mode == learn.ModeKeys.TRAIN
    )

    dense2 = tf.layers.dense(
        inputs=dropout,
        units=1,
        activation=tf.nn.relu
    )

    predictions = tf.reshape(dense2, [-1])

    loss = None
    train_op = None

    if mode != learn.ModeKeys.INFER:
        loss = tf.losses.mean_squared_error(
            labels=labels,
            predictions=predictions
        )

    if mode == learn.ModeKeys.TRAIN:
        train_op = tf.contrib.layers.optimize_loss(
            loss=loss,
            global_step=tf.contrib.framework.get_global_step(),
            learning_rate=0.001,
            optimizer="SGD"
        )

    return model_fn_lib.ModelFnOps(
        mode=mode,
        predictions=predictions,
        loss=loss,
        train_op=train_op
    )
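
The hard-coded 2764800 in the reshape has to match the size of pool1 for each example, so it is worth checking against the input resolution. The following is a minimal sketch of that arithmetic in plain Python; the helper name `flattened_size` and the 480x720 input size are assumptions for illustration (the post does not state the image dimensions), chosen only because they reproduce the number used above.

```python
def flattened_size(height, width, filters=32, pool_size=2, pool_stride=2):
    """Values per example after conv1 ("same" padding) followed by pool1."""
    # A stride-1 convolution with "same" padding keeps height and width unchanged.
    conv_h, conv_w = height, width
    # tf.layers.max_pooling2d uses "valid" padding by default.
    pool_h = (conv_h - pool_size) // pool_stride + 1
    pool_w = (conv_w - pool_size) // pool_stride + 1
    return pool_h * pool_w * filters


# Hypothetical input size; not taken from the original post.
print(flattened_size(480, 720))  # 2764800
```

If the computed value differs from the constant in tf.reshape, the reshape silently regroups pixels across examples whenever the total element count still divides evenly, so the batch dimension no longer lines up with the labels.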

Elsewhere in the program, I initiate the classifier's training like so:

def main(_):
    # Gather data
    images, labels = get_data("./data/labels.csv")

    # Create the estimator
    classifier = learn.Estimator(
        model_fn=cnn_model_fn,
        model_dir="/tmp/network2"
    )

    # Train the model
    classifier.fit(
        x=images,
        y=labels,
        batch_size=10,
        steps=20
    )

    for v in tf.trainable_variables():
        print(v)

labels is a simple one-dimensional NumPy array containing the steering angles for all of the training examples, read from a CSV file. The values in the file are quite close to 0 and average around zero.

When the labels are used exactly as read from the file, or multiplied by a scalar, the network converges reasonably well and achieves a low loss. When I add a constant to them, it fails to converge or diverges. I suspect that all of the network's weights are converging to zero.
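
One way to see why a low loss on the original labels may not mean much is to compute the mean squared error of a model that always predicts zero. The sketch below uses made-up steering angles (the real values from labels.csv are not shown in the post) and a hypothetical helper name. It only illustrates the arithmetic, not the cause of the stuck network.

```python
def mse_of_constant_prediction(labels, prediction=0.0):
    """Mean squared error when the model outputs the same value for every example."""
    return sum((y - prediction) ** 2 for y in labels) / len(labels)


# Hypothetical steering angles: small and centred on zero, as described in the post.
angles = [-0.10, -0.05, 0.0, 0.05, 0.10]
scaled = [10.0 * y for y in angles]   # multiplied by a scalar: mean stays at zero
shifted = [y + 1.0 for y in angles]   # constant added: mean moves to 1.0

print(mse_of_constant_prediction(angles))   # small: zero is already a good guess
print(mse_of_constant_prediction(scaled))   # larger, but zero is still the best constant
print(mse_of_constant_prediction(shifted))  # about 1.0: an all-zero output is now clearly wrong
```

With zero-centred labels, a network that always outputs zero is indistinguishable from a well-trained one by loss alone. Shifting the labels removes that coincidence, which would be consistent with the problem only becoming visible after a constant is added.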

Does anybody see any problems with my methodology?


Answer 1:


That dropout regularization might be the culprit:

dropout = tf.layers.dropout(
    inputs=dense1,
    rate=0.4,
    training=mode == learn.ModeKeys.TRAIN
)

What you are describing, weights that fail to converge adequately or collapse toward zero, is characteristic of a high-bias (underfitting) problem. Common ways to fix this type of problem are removing or reducing the regularization, adding more parameters to your network, or otherwise increasing the model's variance.
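
To make the effect of the rate argument concrete, here is a minimal pure-Python sketch of what a dropout layer does to a vector of activations. The function below is a hypothetical stand-in written for illustration, not the TensorFlow API. It follows the documented behaviour of tf.layers.dropout, where each unit is dropped with probability rate during training and the kept units are scaled by 1 / (1 - rate).

```python
import random


def dropout(values, rate, training, rng=random):
    """Inverted dropout on a list of floats (illustrative stand-in, not tf.layers.dropout)."""
    if not training or rate == 0.0:
        # Outside training, or with rate 0, the layer passes its input through unchanged.
        return list(values)
    keep_scale = 1.0 / (1.0 - rate)
    # Each unit is zeroed with probability `rate`; survivors are scaled up
    # so that the expected value of every unit stays the same.
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(0)  # fixed seed so repeated runs drop the same units
activations = [1.0] * 10
print(dropout(activations, rate=0.4, training=True, rng=rng))
print(dropout(activations, rate=0.4, training=False, rng=rng))
```

Trying the answer's suggestion then amounts to lowering rate (or setting it to 0) in the model function and checking whether the loss on shifted labels starts to fall.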



Source: https://stackoverflow.com/questions/44932870/trained-tensorflow-model-always-outputs-zero
