Question
I am currently working on a problem where I provide a neural network with an input variable a, and another input x, which is a monotonically increasing sequence of N numbers.
So my network basically looks something like this:
from keras.layers import Input, Dense, concatenate
from keras.models import Model

a_input = Input(shape=[1], name='a')
x_input = Input(shape=[N], name='x')
nn = concatenate([a_input, x_input])
nn = Dense(100, activation='relu')(nn)
nn = Dense(N, activation='relu')(nn)
model = Model(inputs=[a_input, x_input], outputs=[nn])
model.compile(loss='mean_squared_error', optimizer="adam")
I perform a regression over the input space (where for each a the sequence x is unique), and I want the network to output a monotonically increasing sequence of N (non-negative) numbers for each set of inputs a and x.
Now, I have noticed that so far my outputs are not strictly speaking monotonic, but sort of look like they are if you 'zoom out'. By this I mean, for a given choice of a and x, if I want my output array to look like:
[0, 0.5, 0.51, 0.7, 0.75, 0.9, 1.],
I might instead get:
[0.001, 0.5, 0.48, 0.7, 0.75, 0.9, 1.].
Hence, I would like to know whether there are standard ways, or specific tools already available in Keras, to constrain a model to output only monotonically increasing sequences.
Answer 1:
To enforce non-negative outputs, use a non-negative activation such as ReLU or sigmoid in your output layer.
I am not aware of any neural method to enforce monotonicity in your output, but in my opinion a sensible approach would be to change the output representation to make the network predict the difference between two consecutive elements. For example, you could transform your output array:
a = [0, 0.5, 0.51, 0.7, 0.75, 0.9, 1.]
to:
b = [0, 0.5, 0.01, 0.19, 0.05, 0.15, 0.1]
with b[0] = a[0] and b[i] = a[i] - a[i-1] for i > 0. Within this context, it would make sense to use a recurrent layer as the output layer, since each output unit now depends on the previous ones. Your original representation can easily be recovered as a[0] = b[0] and a[i] = b[i] + a[i-1] for i > 0, and the resulting sequence will be monotonically increasing because each output b[i] is non-negative.
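As a minimal sketch (not part of the original answer), the transformation to differences and the recovery back to the monotonic sequence could be done with NumPy like this:

import numpy as np

# Original (monotonically increasing) target sequence
a = np.array([0., 0.5, 0.51, 0.7, 0.75, 0.9, 1.])

# Difference representation: b[0] = a[0], b[i] = a[i] - a[i-1] for i > 0
b = np.empty_like(a)
b[0] = a[0]
b[1:] = np.diff(a)

# Recovery: a[0] = b[0], a[i] = b[i] + a[i-1] for i > 0,
# which is simply a cumulative sum of the non-negative differences
a_recovered = np.cumsum(b)
assert np.allclose(a, a_recovered)

Training targets would then be the b arrays rather than the original a arrays.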
UPDATE 1. The LSTM should return the full sequence. You could try building the model as follows:
import keras
from keras.layers import Input, Dense, concatenate, Lambda, LSTM
from keras.models import Model

a_input = Input(shape=[1], name='a')
x_input = Input(shape=[N], name='x')
nn = concatenate([a_input, x_input])
nn = Dense(100, activation='relu')(nn)
nn = Dense(N, activation='relu')(nn)
nn = Lambda(lambda x: x[..., None])(nn) # Output shape=(batch_size, nb_timesteps=N, input_dim=1)
nn = LSTM(1, return_sequences=True, activation='relu')(nn) # Output shape=(batch_size, nb_timesteps=N, output_dim=1)
nn = Lambda(lambda x: keras.backend.squeeze(x, axis=-1))(nn) # Output shape=(batch_size, N)
model = Model(inputs=[a_input, x_input], outputs=[nn])
model.compile(loss='mean_squared_error', optimizer="adam")
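As a usage sketch (the array names X_a, X_x and Y_diff below are hypothetical placeholders, not from the original question), you would train this model on the difference targets and then take a cumulative sum of the predictions to recover the monotonic sequence:

import numpy as np

# X_a: shape (num_samples, 1), X_x: shape (num_samples, N),
# Y_diff: shape (num_samples, N) -- difference representation of the targets.
# All three are hypothetical placeholder arrays.
model.fit([X_a, X_x], Y_diff, epochs=100, batch_size=32)

# The ReLU output activation keeps the predicted differences non-negative,
# so their cumulative sum is a monotonically increasing sequence.
pred_diff = model.predict([X_a, X_x])
pred_monotonic = np.cumsum(pred_diff, axis=1)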
UPDATE 2. The LSTM with one hidden unit might not be powerful enough. I am not sure if this will help, but you could try adding another LSTM layer with more units (e.g. 10) before the last one:
...
nn = Lambda(lambda x: x[..., None])(nn) # Output shape=(batch_size, nb_timesteps=N, input_dim=1)
nn = LSTM(10, return_sequences=True)(nn) # Output shape=(batch_size, nb_timesteps=N, output_dim=10)
nn = LSTM(1, return_sequences=True, activation='relu')(nn) # Output shape=(batch_size, nb_timesteps=N, output_dim=1)
nn = Lambda(lambda x: keras.backend.squeeze(x, axis=-1))(nn) # Output shape=(batch_size, N)
...
Source: https://stackoverflow.com/questions/50823687/how-to-enforce-monotonicity-for-regression-model-outputs-in-keras