If Keras results are not reproducible, what's the best practice for comparing models and choosing hyperparameters?

日久生厌 2020-12-06 14:50

UPDATE: This question was for Tensorflow 1.x. I upgraded to 2.0 and (at least on the simple code below) the reproducibility issue seems fixed on 2.0. So that solves my problem.

4 Answers
  • 2020-12-06 15:26

    Using only the code below, it works. The key to the question, and it is very important, is to call the function reset_seeds() every time before building and running the model. Doing that, you will obtain reproducible results, as I verified in Google Colab.

    import numpy as np
    import tensorflow as tf
    import random as python_random
    
    def reset_seeds():
        np.random.seed(123)        # seed NumPy's RNG
        python_random.seed(123)    # seed Python's built-in RNG
        tf.random.set_seed(1234)   # seed TensorFlow's RNG (TF 2.x API)
    
    reset_seeds() 
    
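    As a minimal sketch of that workflow (the tiny model and data below are my own illustration, not from the question), both runs should print the same final loss:

    # Hypothetical check: call reset_seeds() BEFORE building the model each time
    def build_and_train():
        reset_seeds()
        model = tf.keras.Sequential([
            tf.keras.layers.Dense(8, activation='relu', input_shape=(4,)),
            tf.keras.layers.Dense(1)
        ])
        model.compile(optimizer='adam', loss='mse')
        x = np.random.normal(size=(32, 4))   # reproducible too, since np is seeded
        y = np.random.normal(size=(32,))
        hist = model.fit(x, y, epochs=3, verbose=0)
        return hist.history['loss'][-1]

    print(build_and_train())
    print(build_and_train())  # identical to the first print
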
  • 2020-12-06 15:27

    It's sneaky, but your code does, in fact, lack a step for better reproducibility: resetting the Keras & TensorFlow graphs before each run. Without this, tf.set_random_seed() won't work properly - see correct approach below.

    I'd exhaust all the options before throwing in the towel on non-reproducibility; currently I'm aware of only one such instance, and it's likely a bug. Nonetheless, it's possible you'll get notably differing results even if you follow all the steps - in that case, see "If nothing works"; but each of those is clearly not very productive, so it's best to focus on attaining reproducibility:

    Definitive improvements:

    • Use reset_seeds(K) below
    • Increase numeric precision: K.set_floatx('float64') - sketched below
    • Set PYTHONHASHSEED before the Python kernel starts - e.g. from the terminal - sketched below
    • Upgrade to TF 2, which includes some reproducibility bug fixes, but mind performance
    • Run CPU on a single thread (painfully slow)
    • Do not import from tf.python.keras - see here
    • Ensure all imports are consistent (i.e. don't do from keras.layers import ... and from tensorflow.keras.optimizers import ...)
    • Use a superior CPU - for example, Google Colab's hardware, even when using a GPU, is much more robust against numeric imprecision - see this SO

    Also see related SO on reproducibility
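
    A minimal sketch of the float64 and PYTHONHASHSEED items above (assuming the tf.keras backend; the shell line is illustrative):

    # From the terminal, BEFORE the Python kernel starts - setting PYTHONHASHSEED
    # inside an already-running interpreter is too late:
    #   PYTHONHASHSEED=0 python train.py

    import tensorflow as tf
    from tensorflow.keras import backend as K

    K.set_floatx('float64')  # 64-bit floats reduce accumulated rounding error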


    If nothing works:

    • Rerun X times w/ exact same hyperparameters & seeds, average results (sketched below)
    • K-Fold Cross-Validation w/ exact same hyperparameters & seeds, average results - superior option, but more work involved
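
    A minimal sketch of the rerun-and-average option, assuming a hypothetical train_and_evaluate() that trains once and returns a scalar metric:

    import numpy as np

    results = [train_and_evaluate() for _ in range(10)]  # X = 10 reruns
    print(f"metric: {np.mean(results):.4f} +/- {np.std(results):.4f}")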

    Correct reset method:

    import numpy as np
    import random
    import tensorflow as tf

    def reset_seeds(reset_graph_with_backend=None):
        if reset_graph_with_backend is not None:
            K = reset_graph_with_backend
            K.clear_session()                    # destroys the current Keras graph
            tf.compat.v1.reset_default_graph()   # destroys the current TF graph
            print("KERAS AND TENSORFLOW GRAPHS RESET")  # optional

        np.random.seed(1)
        random.seed(2)
        tf.compat.v1.set_random_seed(3)
        print("RANDOM SEEDS RESET")  # optional

    # Usage: from tensorflow.keras import backend as K; reset_seeds(K)
    

    Running TF on a single CPU thread (TF1-only code; a TF2 equivalent follows):

    session_conf = tf.ConfigProto(
          intra_op_parallelism_threads=1,   # one thread within each individual op
          inter_op_parallelism_threads=1)   # one thread across independent ops
    sess = tf.Session(config=session_conf)
    
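    On TF2, a rough equivalent uses the tf.config.threading API (call these at program start, before any ops run):

    import tensorflow as tf

    tf.config.threading.set_intra_op_parallelism_threads(1)
    tf.config.threading.set_inter_op_parallelism_threads(1)
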
  • 2020-12-06 15:28

    You have several options for stabilizing performance...

    1) Set the seed for your initializers so they are always initialized to the same values (sketched after this list).

    2) More data generally results in a more stable convergence.

    3) Lower learning rates and bigger batch sizes are also good for more predictable learning.

    4) Train for a fixed number of epochs instead of using callbacks to modify hyperparameters during training.

    5) K-fold validation to train on different subsets. The average of these folds should result in a fairly predictable metric.

    6) You also have the option of just training multiple times and averaging the results.
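
    A minimal sketch of option 1 (the layer width and seed value are illustrative):

    from tensorflow import keras

    # With a seeded initializer, this layer starts from the same weights on every run
    layer = keras.layers.Dense(
        32,
        kernel_initializer=keras.initializers.GlorotUniform(seed=42),
    )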

  • 2020-12-06 15:35

    The problem appears to be solved in Tensorflow 2.0 (at least on simple models)! Here is a code snippet that seems to yield repeatable results.

    import os
    ####*IMPORTANT*: this line has to run *before* importing tensorflow
    os.environ['PYTHONHASHSEED']=str(1)
    
    import tensorflow as tf
    import tensorflow.keras as keras
    import tensorflow.keras.layers 
    import random
    import pandas as pd
    import numpy as np
    
    def reset_random_seeds():
       os.environ['PYTHONHASHSEED']=str(1)
       tf.random.set_seed(1)
       np.random.seed(1)
       random.seed(1)
    
    #make some random data
    reset_random_seeds()
    NUM_ROWS = 1000
    NUM_FEATURES = 10
    random_data = np.random.normal(size=(NUM_ROWS, NUM_FEATURES))
    df = pd.DataFrame(data=random_data, columns=['x_' + str(ii) for ii in range(NUM_FEATURES)])
    y = df.sum(axis=1) + np.random.normal(size=(NUM_ROWS))
    
    def run(x, y):
        reset_random_seeds()
    
        model = keras.Sequential([
                keras.layers.Dense(40, input_dim=x.shape[1], activation='relu'),  # use the x argument, not the global df
                keras.layers.Dense(20, activation='relu'),
                keras.layers.Dense(10, activation='relu'),
                keras.layers.Dense(1, activation='linear')
            ])
        NUM_EPOCHS = 500
        model.compile(optimizer='adam', loss='mean_squared_error')
        model.fit(x, y, epochs=NUM_EPOCHS, verbose=0)
        predictions = model.predict(x).flatten()  # identical across runs when reproducible
        loss = model.evaluate(x, y)  # this prints out the loss by side-effect
    
    #With Tensorflow 2.0 this is now reproducible! 
    run(df, y)
    run(df, y)
    run(df, y)
    
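    This was verified on CPU; on a GPU some ops can still be nondeterministic. On TF 2.1+ you can additionally request deterministic kernels via an environment variable (my addition, not part of the snippet above):

    # Set early, before building the model; affects only ops with deterministic variants
    os.environ['TF_DETERMINISTIC_OPS'] = '1'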