Custom loss function not improving with epochs

白昼怎懂夜的黑 submitted on 2021-01-01 09:08:43

Question


I have created a custom loss function to deal with binary class imbalance, but the loss does not improve from one epoch to the next. For metrics I'm using precision and recall.

Is this a design issue where I'm not picking good hyper-parameters?

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras

weights = [np.array([.10, .90]), np.array([.5, .5]), np.array([.1, .99]),
           np.array([.25, .75]), np.array([.35, .65])]
for weight in weights:
    print('Model with weights {a}'.format(a=weight))
    model = keras.models.Sequential([
        keras.layers.Flatten(),  # input_shape=[X_train.shape[1]]
        keras.layers.Dense(32, activation='relu'),
        keras.layers.Dense(32, activation='relu'),
        keras.layers.Dense(1, activation='sigmoid')])
    # No optimizer is passed, so compile() falls back to the default ('rmsprop').
    model.compile(loss=weighted_loss(weight),
                  metrics=[tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])

    n_epochs = 10
    history = model.fit(X_train.astype('float32'), y_train.values.astype('float32'),
                        epochs=n_epochs, batch_size=64,
                        validation_data=(X_test.astype('float32'), y_test.values.astype('float32')))
    model.evaluate(X_test.astype('float32'), y_test.astype('float32'))
    pd.DataFrame(history.history).plot(figsize=(8, 5))
    plt.grid(True); plt.gca().set_ylim(0, 1); plt.show()

The custom loss function used to deal with the class imbalance:

def weighted_loss(weights):
    weights = K.variable(weights)            
    def loss(y_true, y_pred):
        y_pred /= K.sum(y_pred, axis=-1, keepdims=True)
        y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
        loss = y_true * K.log(y_pred) * weights
        loss = -K.sum(loss, -1)      
        return loss
    return loss

Output:

Model with weights [0.1 0.9]
Epoch 1/10
274/274 [==============================] - 1s 2ms/step - loss: 1.1921e-08 - precision_24: 0.1092 - recall_24: 0.4119 - val_loss: 1.4074e-08 - val_precision_24: 0.1247 - val_recall_24: 0.3953
Epoch 2/10
274/274 [==============================] - 0s 1ms/step - loss: 1.1921e-08 - precision_24: 0.1092 - recall_24: 0.4119 - val_loss: 1.4074e-08 - val_precision_24: 0.1247 - val_recall_24: 0.3953
Epoch 3/10
274/274 [==============================] - 0s 1ms/step - loss: 1.1921e-08 - precision_24: 0.1092 - recall_24: 0.4119 - val_loss: 1.4074e-08 - val_precision_24: 0.1247 - val_recall_24: 0.3953
Epoch 4/10
274/274 [==============================] - 0s 969us/step - loss: 1.1921e-08 - precision_24: 0.1092 - recall_24: 0.4119 - val_loss: 1.4074e-08 - val_precision_24: 0.1247 - val_recall_24: 0.3953
[...]

The input dataset is a (17480 × 20) matrix.

y is the output array (2 classes) with dimensions (17480 × 1); the total number of 1's is 1748 (the class I want to predict).


Answer 1:


Since there is no MWE present, it's rather difficult to be sure. To be as instructive as possible, I'll lay out some observations and remarks.

The first observation is that your custom loss takes on extremely small values (~1e-8) throughout training. That tells the optimizer that performance is already essentially perfect, while the precision and recall metrics show that it isn't. This points to a problem near the output layer or in the loss function itself. Since you have a classification problem, I recommend having a look at this post on weighted cross-entropy [1].
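
One thing worth noticing in the posted loss: `y_pred /= K.sum(y_pred, axis=-1, keepdims=True)` divides a one-unit sigmoid output by itself, so after clipping y_pred is always ≈ 1 and the log term is ≈ 0, which would explain why the loss is pinned at ~1e-8. As a concrete illustration, here is a minimal sketch of a weighted binary cross-entropy for a single sigmoid output. The function name and the weight ordering (weights[0] for the negative class, weights[1] for the positive class) are assumptions for the sketch, not something from the original post:

from tensorflow.keras import backend as K

def weighted_binary_loss(weights):
    # Assumed convention: weights[0] weighs the negative class,
    # weights[1] the positive class; adjust to match your own arrays.
    w_neg, w_pos = float(weights[0]), float(weights[1])
    def loss(y_true, y_pred):
        # Clip only; do not renormalize a single sigmoid output, since
        # dividing it by its own sum forces every prediction to 1.
        y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
        return -K.mean(w_pos * y_true * K.log(y_pred)
                       + w_neg * (1.0 - y_true) * K.log(1.0 - y_pred),
                       axis=-1)
    return loss

Note that Keras's model.fit also accepts a class_weight dictionary (e.g. class_weight={0: 0.1, 1: 0.9}), which achieves a similar effect without a custom loss.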

The second observation is that you don't seem to have a performance benchmark for your model. In general, an ML workflow goes from very simple to complex models. I would recommend trying a simple logistic regression [2] to get an idea of minimal performance, and after that some more complex models such as a gradient-boosted tree ensemble (XGBoost/LightGBM/...) or a random forest. This is especially relevant because you are using a full-blown neural network on tabular data with only about 20 numerical features, which still tends to be traditional machine learning territory.
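
A quick baseline along those lines might look like the following sketch. It reuses the X_train/y_train names from the question; using class_weight='balanced' to account for the imbalance is my own assumption about what a sensible baseline would include:

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score

# Class-weighted logistic regression as a minimal benchmark.
baseline = LogisticRegression(class_weight='balanced', max_iter=1000)
baseline.fit(X_train, y_train)
y_hat = baseline.predict(X_test)
print('precision:', precision_score(y_test, y_hat))
print('recall:', recall_score(y_test, y_hat))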

Once you have obtained a baseline, and perhaps improved on it with a standard machine learning technique, you can look towards a neural network again. Some further recommendations, depending on the results of the traditional approaches:

  • Try several optimizers and cross-validate them over different learning rates.

  • Try, as mentioned by @TyQuangTu, some simpler and shallower architectures.

  • Try an activation function that does not have the "dying neuron" problem, such as LeakyReLU or ELU (see the sketch after this list).
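
A minimal sketch of those last two suggestions, with the learning rate, slope, and layer sizes picked purely for illustration:

from tensorflow import keras

model = keras.models.Sequential([
    keras.layers.Flatten(),
    keras.layers.Dense(32),
    keras.layers.LeakyReLU(alpha=0.1),  # small negative slope keeps gradients alive
    keras.layers.Dense(1, activation='sigmoid')])
# Explicit optimizer with a tunable learning rate instead of the default.
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              loss='binary_crossentropy',
              metrics=[keras.metrics.Precision(), keras.metrics.Recall()])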

Hopefully this answer helps, and if you have any more questions I am glad to help.

[1] Unbalanced data and weighted cross entropy

[2] https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html



Source: https://stackoverflow.com/questions/65067446/custom-loss-function-not-improving-with-epochs
