How to save Scikit-Learn-Keras Model into a Persistence File (pickle/hd5/json/yaml)

小鲜肉 2020-12-13 19:23

I have the following code, using Keras Scikit-Learn Wrapper:

from keras.models import Sequential
from sklearn import datasets
from keras.layers import Dense
         


        
5 answers
  • 2020-12-13 19:47

    Another great alternative is to use callbacks when you fit your model. Specifically the ModelCheckpoint callback, like this:

    from keras.callbacks import ModelCheckpoint
    from keras.wrappers.scikit_learn import KerasClassifier
    
    # create an instance of ModelCheckpoint; with save_best_only=False the
    # file is rewritten after every epoch, whatever the monitored value is
    chk = ModelCheckpoint("myModel.h5", monitor='val_loss', save_best_only=False)
    # add that callback to the list of callbacks to pass
    callbacks_list = [chk]
    # create your model (Keras 2 renamed nb_epoch to epochs)
    model_tt = KerasClassifier(build_fn=create_model, epochs=150, batch_size=10)
    # fit your model with your data and pass the callback(s) here
    model_tt.fit(X_train, y_train, callbacks=callbacks_list)
    

    This saves the model to myModel.h5 at the end of every epoch, overwriting the previous file. The benefit is that you can stop training whenever you like (for example, once you see the model has started to overfit) and still keep the training done so far.

    Note that this saves both the structure and the weights in the same HDF5 file (as shown by Zach), so you can then load your model using keras.models.load_model.

    If you want to save only the weights, pass save_weights_only=True when instantiating ModelCheckpoint; you can then load the model as explained by Gaarv. From the docs:

    save_weights_only: if True, then only the model's weights will be saved (model.save_weights(filepath)), else the full model is saved (model.save(filepath)).
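
    Because the file name above is fixed, each epoch overwrites the previous checkpoint. ModelCheckpoint also accepts a file name with format placeholders such as {epoch:02d}, which it fills in from the epoch number and the logged metrics, so that every epoch gets its own file. The snippet below is only a sketch of how such a template expands, using plain str.format; the template string and the metric value are made up for illustration.

```python
# Hypothetical template; Keras fills the placeholders from the epoch
# number and the metrics logged at the end of that epoch.
template = "myModel.{epoch:02d}-{val_loss:.2f}.h5"


def checkpoint_name(template, epoch, logs):
    """Expand a checkpoint file name template with str.format."""
    return template.format(epoch=epoch, **logs)


# Illustrative values, not real training output.
name = checkpoint_name(template, epoch=3, logs={"val_loss": 0.4567})
print(name)  # myModel.03-0.46.h5
```

    A val_loss placeholder can only be filled in when validation data is passed to fit, since otherwise that metric is never logged.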

  • 2020-12-13 19:49

    The accepted answer is too complicated. You can fully save and restore every aspect of your model in a .h5 file. Straight from the Keras FAQ:

    You can use model.save(filepath) to save a Keras model into a single HDF5 file which will contain:

    • the architecture of the model, allowing to re-create the model
    • the weights of the model
    • the training configuration (loss, optimizer)
    • the state of the optimizer, allowing to resume training exactly where you left off.

    You can then use keras.models.load_model(filepath) to reinstantiate your model. load_model will also take care of compiling the model using the saved training configuration (unless the model was never compiled in the first place).

    And the corresponding code:

    from keras.models import load_model
    
    model.save('my_model.h5')  # creates an HDF5 file 'my_model.h5'
    del model  # deletes the existing model
    
    # returns a compiled model identical to the previous one
    model = load_model('my_model.h5')
    
  • 2020-12-13 19:51

    If your Keras wrapper model is a step in a scikit-learn Pipeline, save the steps of the pipeline separately: the preprocessing step with joblib and the Keras model with its own save method.

    import joblib
    from sklearn.pipeline import Pipeline
    from tensorflow import keras
    
    # pass the create_cnn_model function into wrapper
    cnn_model = keras.wrappers.scikit_learn.KerasClassifier(build_fn=create_cnn_model)
    
    # create pipeline
    cnn_model_pipeline_estimator = Pipeline([
        ('preprocessing_pipeline', pipeline_estimator),
        ('clf', cnn_model)
    ])
    
    # train model
    final_model = cnn_model_pipeline_estimator.fit(
        X, y, clf__batch_size=32, clf__epochs=15)
    
    # collect the preprocessing pipeline & model separately
    pipeline_estimator = final_model.named_steps['preprocessing_pipeline']
    clf = final_model.named_steps['clf']
    
    # store pipeline and model separately
    joblib.dump(pipeline_estimator, 'path/to/pipeline.pkl')
    clf.model.save('path/to/model.h5')
    
    # load pipeline and model (same file name that was used for dumping)
    pipeline_estimator = joblib.load('path/to/pipeline.pkl')
    model = keras.models.load_model('path/to/model.h5')
    
    new_example = [[...]]
    
    # transform new data with pipeline & use model for prediction
    transformed_data = pipeline_estimator.transform(new_example)
    prediction = model.predict(transformed_data)
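
    The same idea can be wrapped in two small helpers that keep both files in one directory. This is only a sketch under a few assumptions: the names save_bundle and load_bundle and the file names are hypothetical, the standard library's pickle stands in for joblib, and the model is treated as any object with a save(path) method, as Keras models have. The loader function is passed in by the caller; with Keras it would typically be keras.models.load_model.

```python
import os
import pickle


def save_bundle(preprocessor, keras_model, directory):
    """Write the preprocessing object and the model side by side."""
    os.makedirs(directory, exist_ok=True)
    with open(os.path.join(directory, "pipeline.pkl"), "wb") as fh:
        pickle.dump(preprocessor, fh)
    # The model is stored with its own save() method, not with pickle.
    keras_model.save(os.path.join(directory, "model.h5"))


def load_bundle(directory, load_model_fn):
    """Return (preprocessor, model) read back from the directory.

    load_model_fn is supplied by the caller, e.g. keras.models.load_model.
    """
    with open(os.path.join(directory, "pipeline.pkl"), "rb") as fh:
        preprocessor = pickle.load(fh)
    model = load_model_fn(os.path.join(directory, "model.h5"))
    return preprocessor, model
```

    With the objects from the answer above, usage would look like save_bundle(pipeline_estimator, clf.model, 'path/to/bundle') followed later by load_bundle('path/to/bundle', keras.models.load_model).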
    
  • 2020-12-13 19:52

    Edit 1: Original answer about saving the model

    With JSON for the architecture and HDF5 for the weights:

    # saving the model architecture
    json_model = model_tt.model.to_json()
    with open('model_architecture.json', 'w') as f:
        f.write(json_model)
    # saving weights
    model_tt.model.save_weights('model_weights.h5', overwrite=True)
    
    
    # loading model
    from keras.models import model_from_json
    
    with open('model_architecture.json') as f:
        model = model_from_json(f.read())
    model.load_weights('model_weights.h5')
    
    # don't forget to compile your model
    model.compile(loss='binary_crossentropy', optimizer='adam')
    

    Edit 2: full code example with the iris dataset

    # Train model and make predictions
    import numpy
    from keras.models import Sequential, model_from_json
    from keras.layers import Dense
    from keras.utils import np_utils
    from sklearn import datasets
    from sklearn import preprocessing
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import LabelEncoder
    
    # fix random seed for reproducibility
    seed = 7
    numpy.random.seed(seed)
    
    # load dataset
    iris = datasets.load_iris()
    X, Y, labels = iris.data, iris.target, iris.target_names
    X = preprocessing.scale(X)
    
    # encode class values as integers
    encoder = LabelEncoder()
    encoder.fit(Y)
    encoded_Y = encoder.transform(Y)
    
    # convert integers to dummy variables (i.e. one hot encoded)
    y = np_utils.to_categorical(encoded_Y)
    
    def build_model():
        # create model (Keras 2 renamed init to kernel_initializer)
        model = Sequential()
        model.add(Dense(4, input_dim=4, kernel_initializer='normal', activation='relu'))
        model.add(Dense(3, kernel_initializer='normal', activation='sigmoid'))
        model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
        return model
    
    def save_model(model):
        # saving the model architecture
        json_model = model.to_json()
        with open('model_architecture.json', 'w') as f:
            f.write(json_model)
        # saving weights
        model.save_weights('model_weights.h5', overwrite=True)
    
    def load_model():
        # loading the architecture, then the weights
        with open('model_architecture.json') as f:
            model = model_from_json(f.read())
        model.load_weights('model_weights.h5')
        # the JSON file holds no training configuration, so compile again
        model.compile(loss='categorical_crossentropy', optimizer='adam')
        return model
    
    
    X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.3, random_state=seed)
    
    # build
    model = build_model()
    model.fit(X_train, Y_train, epochs=200, batch_size=5, verbose=0)
    
    # save
    save_model(model)
    
    # load
    model = load_model()
    
    # predictions
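    # predict_classes was removed in later Keras releases; taking the argmax
    # of the predicted scores gives the same class indices
    predictions = numpy.argmax(model.predict(X_test, verbose=0), axis=1)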
    print(predictions)
    # reverse encoding
    for pred in predictions:
        print(labels[pred])
    

    Please note that I used Keras only, not the wrapper; the wrapper only adds complexity to something simple. The code is also deliberately not factored, so you can see the whole picture.

    Also, you said you want to output 1 or 0. That is not possible with this dataset, because it has 3 classes (Iris-setosa, Iris-versicolor, Iris-virginica) and therefore 3 output dimensions. If you had only 2 classes, you could use a single sigmoid output, which yields a value between 0 and 1 that can be rounded to 0 or 1.

  • 2020-12-13 19:56

    Just adding to gaarv's answer - If you don't require the separation between the model structure (model.to_json()) and the weights (model.save_weights()), you can use one of the following:

    • Use the built-in keras.models.save_model and keras.models.load_model, which store everything together in an HDF5 file.
    • Use pickle to serialize the Model object (or any class that contains references to it) to a file, over the network, or anywhere else.
      Unfortunately, Keras doesn't support pickle by default. You can use my patchy solution that adds this missing feature. Working code is here: http://zachmoshe.com/2017/04/03/pickling-keras-models.html
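
    One general way to make such an object picklable is to have __getstate__ save the model to a temporary file and return the file's bytes, and have __setstate__ write the bytes back to a temporary file and load the model from it. The sketch below illustrates that pattern with a wrapper class; it is not the code from the linked post. The class name PicklableModel is hypothetical, and save_fn and load_fn are supplied by the caller (with Keras they would presumably be keras.models.save_model and keras.models.load_model, which is an assumption not exercised here).

```python
import os
import tempfile


class PicklableModel:
    """Wraps a model so that its pickled state is the saved file's bytes."""

    def __init__(self, model, save_fn, load_fn):
        self.model = model
        self.save_fn = save_fn  # called as save_fn(model, path)
        self.load_fn = load_fn  # called as load_fn(path), returns a model

    def __getstate__(self):
        # Save to a temporary file and keep only the raw bytes.
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "model.h5")
            self.save_fn(self.model, path)
            with open(path, "rb") as fh:
                blob = fh.read()
        return {"blob": blob, "save_fn": self.save_fn, "load_fn": self.load_fn}

    def __setstate__(self, state):
        self.save_fn = state["save_fn"]
        self.load_fn = state["load_fn"]
        # Write the bytes back to disk and let load_fn rebuild the model.
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "model.h5")
            with open(path, "wb") as fh:
                fh.write(state["blob"])
            self.model = self.load_fn(path)
```

    An instance can then be passed to pickle.dumps and pickle.loads like any other object, provided save_fn and load_fn are themselves picklable (module-level functions are).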