Can the sigmoid activation function be used to solve regression problems in Keras?

Question

I have implemented simple neural networks in R, but this is my first time doing so with Keras, so I would appreciate some advice.

I developed a neural network function in Keras to predict car sales (the dataset is available here). CarSales is the dependent variable.

As far as I'm aware, Keras is used to develop a neural network for classification purposes rather than regression. In all the examples I have seen so far, the output is bounded between 0 and 1.

Here is the code I developed, and you will see that I am using the 'sigmoid' function for the output:

from tensorflow.keras.models import Sequential  # public API path; tensorflow.python.* is private
from tensorflow.keras.layers import Dense
from tensorflow.python.keras.wrappers.scikit_learn import KerasRegressor
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

import os
path = "C:/Users"  # os.chdir needs the directory that contains cars.csv, not the file itself
os.chdir(path)
os.getcwd()

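# Variables
dataset = np.loadtxt("cars.csv", delimiter=",")
x = dataset[:, 0:5]
y = dataset[:, 5]
y = np.reshape(y, (-1, 1))
# One scaler per array: refitting a single scaler on y would overwrite the bounds learned from x
x_scaler = MinMaxScaler()
y_scaler = MinMaxScaler()
print(x_scaler.fit(x))
print(y_scaler.fit(y))
xscale = x_scaler.transform(x)
yscale = y_scaler.transform(y)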

model = Sequential()
model.add(Dense(12, input_dim=5, kernel_initializer='normal', activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.summary()

model.compile(loss='mse', optimizer='adam', metrics=['mse','mae','mape','cosine','accuracy'])
model.fit(xscale, yscale, epochs=150, batch_size=50,  verbose=1, validation_split=0.2)

As you can see, I am using the MinMaxScaler to bound the variables, and thus the output, between 0 and 1.

After training for 150 epochs, the mean_squared_error and mean_absolute_error are quite low. However, the mean_absolute_percentage_error is quite high, though I suspect that this is not a good metric for evaluating a sigmoid output.

Is bounding the output variable between 0 and 1 and then running the model an acceptable way of trying to predict an interval variable using a neural network?

Answer 1:

"Is bounding the output variable between 0 and 1 and then running the model an acceptable way of trying to predict an interval variable using a neural network?"

I suppose that can work if you know the range of values that your output can take in advance. It's certainly not common though.

With the following code, you're essentially cheating a bit: the scaler's bounds are computed from all of the data (training and validation), whereas only the training data should be used.

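dataset = np.loadtxt("cars.csv", delimiter=",")
x = dataset[:, 0:5]
y = dataset[:, 5]
y = np.reshape(y, (-1, 1))
# Both scalers are fitted on the full dataset, training and validation rows alike
x_scaler = MinMaxScaler()
y_scaler = MinMaxScaler()
print(x_scaler.fit(x))
print(y_scaler.fit(y))
xscale = x_scaler.transform(x)
yscale = y_scaler.transform(y)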

If you don't cheat like that though, you may get values in the validation data that exceed your bounds. If you then still use a sigmoid, you'll be unable to make correct predictions (which should lie outside [0, 1] if scaled according to bounds determined by training data only).
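The effect described above can be reproduced with a few lines of plain Python. This is only a sketch: the sales figures are made up for illustration, and the helpers `fit_min_max` and `scale` are hypothetical stand-ins for what MinMaxScaler does to a single column with its default range.

```python
def fit_min_max(values):
    """Return the (min, max) bounds learned from the given values."""
    return min(values), max(values)


def scale(value, bounds):
    """Min-max scale one value using previously learned bounds."""
    lo, hi = bounds
    return (value - lo) / (hi - lo)


# Hypothetical sales figures, invented for this illustration.
train_sales = [100.0, 150.0, 200.0]
valid_sales = [120.0, 210.0]

# The bounds come from the training data only.
bounds = fit_min_max(train_sales)

train_scaled = [scale(v, bounds) for v in train_sales]
valid_scaled = [scale(v, bounds) for v in valid_sales]

print(train_scaled)  # training targets all fall inside [0, 1]
print(valid_scaled)  # the 210.0 validation target lands above 1
```

The validation value 210.0 exceeds the training maximum of 200.0, so its scaled target is about 1.1, which a sigmoid output can never produce.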

It's much more common to simply end with a linear layer for regression tasks, as Hemen suggests in the other answer.

Your learning process may still benefit from scaling outputs in the training data to [0, 1], but then outputs outside training data could, for example, get mapped to 1.1 if they slightly exceed all values observed in training data.
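To make the difference between the two output activations concrete, here is a small sketch in plain Python. It assumes a scaled target of 1.1 (the example value above) and compares what a sigmoid unit and a linear (identity) unit can output for a handful of arbitrarily chosen pre-activation values; it does not train a network.

```python
import math


def sigmoid(z):
    """Logistic sigmoid; its output always lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


target = 1.1  # scaled target slightly above the training maximum

# Arbitrary pre-activation values feeding the output unit.
pre_activations = [-2.0, 0.0, 1.1, 5.0, 20.0]

sigmoid_outputs = [sigmoid(z) for z in pre_activations]
linear_outputs = list(pre_activations)  # identity activation passes z through

sigmoid_errors = [(out - target) ** 2 for out in sigmoid_outputs]
linear_errors = [(out - target) ** 2 for out in linear_outputs]

print(max(sigmoid_outputs))  # stays below 1 no matter how large z gets
print(min(sigmoid_errors))   # cannot drop below (1.1 - 1)^2 = 0.01
print(min(linear_errors))    # the linear unit hits the target exactly at z = 1.1
```

However large the pre-activation becomes, the sigmoid unit's squared error against this target stays above 0.01, while the linear unit can match it exactly.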

Answer 2:

To perform regression with a neural network, you should use a linear activation function in the final output layer.

Try the following code.

model = Sequential()
model.add(Dense(12, input_dim=5, kernel_initializer='normal', activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='linear'))
model.summary()

Source: https://stackoverflow.com/questions/48976413/can-the-sigmoid-activation-function-be-used-to-solve-regression-problems-in-kera

Tags

python

tensorflow

neural-network

keras