Results not reproducible with Keras and TensorFlow in Python

前端未结

关注

 4  1703

春和景丽

I have the problem, that I am not able to reproduce my results with Keras and ThensorFlow.

It seems like recently there has been a workaround published on the Keras

相关标签:

4条回答

走了就别回头了

2020-12-16 19:06

Looks like a bug in TensorFlow / Keras not sure. When setting the Keras back-end to CNTK the results are reproducible.

I even tried with several versions of TensorFlow from 1.2.1 till 1.13.1. All the TensorFlow versions results doesn't agree with multiple runs even when the random seeds are set.

0 讨论(0)
发布评论:

提交评论
- 加载中...
渐次进展

2020-12-16 19:08
My answer is the following, which uses Keras with Tensorflow as backend. Within your nested for loop, where one typically iterates through the various parameters you wish to explore for your model's development, immediately add this function after your last for loop.
```
for...
   for...
      reset_keras()
      .
      .
      .
```
where the reset function is defined as
```
def reset_keras():
    sess = tf.keras.backend.get_session()
    tf.keras.backend.clear_session()
    sess.close()
    sess = tf.keras.backend.get_session()
    np.random.seed(1)
    tf.set_random_seed(2)
```
PS: The function above also actually avoids your nvidia GPU from building up too much memory (which happens after many iteration) so that it eventually becomes very slow...so the function restores GPU performance and maintains results as reproducible.
0 讨论(0)
发布评论:

提交评论
- 加载中...

春和景丽

2020-12-16 19:10

I had exactly the same problem and managed to solve it by closing and restarting the tensorflow session every time I run the model. In your case it should look like this:

#START A NEW TF SESSION
np.random.seed(0)
tf.set_random_seed(0)
sess = tf.Session(graph=tf.get_default_graph())
K.set_session(sess)

embedding_vecor_length = 32
neurons = 91
epochs = 1
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(neurons))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_logarithmic_error', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=epochs, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

#CLOSE TF SESSION
K.clear_session()

I ran the following code and had reproducible results using GPU and tensorflow backend:

print datetime.now()
for i in range(10):
    np.random.seed(0)
    tf.set_random_seed(0)
    sess = tf.Session(graph=tf.get_default_graph())
    K.set_session(sess)

    n_classes = 3
    n_epochs = 20
    batch_size = 128

    task = Input(shape = x.shape[1:])
    h = Dense(100, activation='relu', name='shared')(task)
    h1= Dense(100, activation='relu', name='single1')(h)
    output1 = Dense(n_classes, activation='softmax')(h1)

    model = Model(task, output1)
    model.compile(loss='categorical_crossentropy', optimizer='Adam')
    model.fit(x_train, y_train_onehot, batch_size = batch_size, epochs=n_epochs, verbose=0)
print(model.evaluate(x=x_test, y=y_test_onehot, batch_size=batch_size, verbose=0))
K.clear_session()

And obtained this output:

2017-10-23 11:27:14.494482
0.489712882132
0.489712893813
0.489712892765
0.489712854426
0.489712882132
0.489712864011
0.486303713004
0.489712903398
0.489712892765
0.489712903398

What I understood is that if you don't close your tf session (you are doing it by running in a new kernel) you keep sampling the same "seeded" distribution.

0 讨论(0)

小鲜肉

2020-12-16 19:16

The thing that worked for me was to run the training every time in a new console. In addition to this I also have this parameters set:

RANDOM_STATE = 42

os.environ['PYTHONHASHSEED'] = str(RANDOM_STATE)
random.seed(RANDOM_STATE)
np.random.seed(RANDOM_STATE)
tf.set_random_seed(RANDOM_STATE)

session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)

intra_op_parallelism could also be a bigger value

0 讨论(0)