Question
I am currently working on a problem where I provide a neural network with an input variable a, and another input x, which is a monotonically increasing sequence of N numbers.
So my network basically looks something like this:
from keras.layers import Input, Dense, concatenate
from keras.models import Model

a_input = Input(shape=[1], name='a')
x_input = Input(shape=[N], name='x')
nn = concatenate([a_input, x_input])
nn = Dense(100, activation='relu')(nn)
nn = Dense(N, activation='relu')(nn)
model = Model(inputs=[a_input, x_input], outputs=[nn])
model.compile(loss='mean_squared_error', optimizer="adam")
I perform a regression over the input space (where for each a the sequence x is unique), and I want the network to output a monotonically increasing sequence of N (non-negative) numbers for each set of inputs a and x.
Now, I have noticed that so far my outputs are not strictly speaking monotonic, but sort of look like they are if you 'zoom out'. By this I mean, for a given choice of a and x, if I want my output array to look like:
[0, 0.5, 0.51, 0.7, 0.75, 0.9, 1.],
I might instead get:
[0.001, 0.5, 0.48, 0.7, 0.75, 0.9, 1.].
Hence, I would like to know whether there are standard ways, or specific tools already available in Keras, to constrain a model to output only monotonically increasing sequences.
Answer 1:
To enforce non-negative outputs, use a non-negative activation such as ReLU or sigmoid in your output layer.
I am not aware of any neural method to enforce monotonicity in your output, but in my opinion a sensible approach would be to change the output representation to make the network predict the difference between two consecutive elements. For example, you could transform your output array:
a = [0, 0.5, 0.51, 0.7, 0.75, 0.9, 1.]
to:
b = [0, 0.5, 0.01, 0.19, 0.05, 0.15, 0.1]
with b[0] = a[0] and b[i] = a[i] - a[i-1] for i > 0. Within this context, it would make sense to use a recurrent layer as the output layer, since each output unit now depends on the previous ones. Your original representation can easily be recovered as a[0] = b[0] and a[i] = b[i] + a[i-1] for i > 0, and the resulting sequence will be monotonically increasing because each output b[i] is non-negative.
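As a minimal sketch (not part of the original answer), the transformation to differences and the recovery back to the monotonic sequence could be done with NumPy like this:

import numpy as np

# Original (monotonically increasing) target sequence
a = np.array([0., 0.5, 0.51, 0.7, 0.75, 0.9, 1.])

# Difference representation: b[0] = a[0], b[i] = a[i] - a[i-1] for i > 0
b = np.empty_like(a)
b[0] = a[0]
b[1:] = np.diff(a)

# Recovery: a[0] = b[0], a[i] = b[i] + a[i-1] for i > 0,
# which is simply a cumulative sum of the non-negative differences
a_recovered = np.cumsum(b)
assert np.allclose(a, a_recovered)

Training targets would then be the b arrays rather than the original a arrays.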
UPDATE 1. The LSTM should return the full sequence. You could try building the model as follows:
import keras
from keras.layers import Input, Dense, concatenate, Lambda, LSTM
from keras.models import Model

a_input = Input(shape=[1], name='a')
x_input = Input(shape=[N], name='x')
nn = concatenate([a_input, x_input])
nn = Dense(100, activation='relu')(nn)
nn = Dense(N, activation='relu')(nn)
nn = Lambda(lambda x: x[..., None])(nn) # Output shape=(batch_size, nb_timesteps=N, input_dim=1)
nn = LSTM(1, return_sequences=True, activation='relu')(nn) # Output shape=(batch_size, nb_timesteps=N, output_dim=1)
nn = Lambda(lambda x: keras.backend.squeeze(x, axis=-1))(nn) # Output shape=(batch_size, N)
model = Model(inputs=[a_input, x_input], outputs=[nn])
model.compile(loss='mean_squared_error', optimizer="adam")
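As a usage sketch (the array names X_a, X_x and Y_diff below are hypothetical placeholders, not from the original question), you would train this model on the difference targets and then take a cumulative sum of the predictions to recover the monotonic sequence:

import numpy as np

# X_a: shape (num_samples, 1), X_x: shape (num_samples, N),
# Y_diff: shape (num_samples, N) -- difference representation of the targets.
# All three are hypothetical placeholder arrays.
model.fit([X_a, X_x], Y_diff, epochs=100, batch_size=32)

# The ReLU output activation keeps the predicted differences non-negative,
# so their cumulative sum is a monotonically increasing sequence.
pred_diff = model.predict([X_a, X_x])
pred_monotonic = np.cumsum(pred_diff, axis=1)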
UPDATE 2. The LSTM with one hidden unit might not be powerful enough. I am not sure if this will help, but you could try adding another LSTM layer with more units (e.g. 10) before the last one:
...
nn = Lambda(lambda x: x[..., None])(nn) # Output shape=(batch_size, nb_timesteps=N, input_dim=1)
nn = LSTM(10, return_sequences=True)(nn) # Output shape=(batch_size, nb_timesteps=N, output_dim=10)
nn = LSTM(1, return_sequences=True, activation='relu')(nn) # Output shape=(batch_size, nb_timesteps=N, output_dim=1)
nn = Lambda(lambda x: keras.backend.squeeze(x, axis=-1))(nn) # Output shape=(batch_size, N)
...
Source: https://stackoverflow.com/questions/50823687/how-to-enforce-monotonicity-for-regression-model-outputs-in-keras