What are the pitfalls of using Dill to serialise scikit-learn/statsmodels models?

灰色年华 2021-01-31 09:24

I need to serialise scikit-learn/statsmodels models such that all the dependencies (code + data) are packaged in an artefact, and this artefact can be used to initialise the model.

3 Answers
  •  青春惊慌失措
    2021-01-31 10:08

    Ok, to begin with: in your sample code, plain pickle could work fine. I use pickle all the time to package a model and use it later. The exceptions are when you want to send the model directly to another server or save the interpreter state, because that is what Dill is good at and pickle cannot do. It also depends on your code and which types you use; pickle might fail where Dill is more robust. (A minimal sketch of the pickle round trip follows.)
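    For instance, a minimal sketch of that round trip with plain pickle; the LogisticRegression model and the file name are illustrative assumptions, not from the question:

    ```python
    # Persist a fitted scikit-learn model with plain pickle.
    import pickle

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)
    model = LogisticRegression(max_iter=1000).fit(X, y)

    # Serialise to disk.
    with open("model.pkl", "wb") as f:
        pickle.dump(model, f)

    # Later, with the same library versions installed, restore and predict.
    with open("model.pkl", "rb") as f:
        restored = pickle.load(f)

    print(restored.predict(X[:5]))
    ```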

    Dill is primarily based on pickle, so the two are very similar. Some things you should take into account / look into:

    1. Limitations of Dill

      The standard types frame, generator and traceback cannot be packaged (a quick probe follows after this list).

    2. cloudpickle might be a good idea for your problem as well: it has better support for pickling objects than pickle (not necessarily better than Dill, per se), and it can pickle code easily too (also shown in the sketch below).
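    A quick probe of both points, assuming dill and cloudpickle are installed; dill.pickles is dill's built-in check for whether an object can be serialised:

    ```python
    import dill
    import cloudpickle

    # Point 1: live generators are among the standard types dill cannot package.
    gen = (i * i for i in range(3))
    print(dill.pickles(gen))  # False

    # Point 2: cloudpickle embeds the code of functions/lambdas in the payload,
    # so the receiving process does not need the source on its import path.
    scale = 2.5
    transform = lambda x: x * scale

    payload = cloudpickle.dumps(transform)
    restored = cloudpickle.loads(payload)
    print(restored(4))  # 10.0
    ```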

    Once the target machine has the correct libraries loaded (be careful about differing Python versions as well, since they can break your code), everything should work fine with both Dill and cloudpickle, as long as you do not use the unsupported standard types. To catch version drift early, you can store an environment fingerprint next to the model, as sketched below.
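    A minimal sketch of such a guard, assuming a scikit-learn model; the metadata layout here is an illustrative assumption, not a standard:

    ```python
    import pickle
    import sys

    import sklearn

    def save_with_metadata(model, path):
        # Bundle the model with the versions it was trained under.
        artefact = {
            "model": model,
            "python": sys.version_info[:3],
            "sklearn": sklearn.__version__,
        }
        with open(path, "wb") as f:
            pickle.dump(artefact, f)

    def load_with_check(path):
        # Note: the whole artefact is unpickled before the check runs, so this
        # catches silent behaviour drift rather than blocking the load itself;
        # for a hard gate, store the metadata in a separate sidecar file.
        with open(path, "rb") as f:
            artefact = pickle.load(f)
        if artefact["sklearn"] != sklearn.__version__:
            raise RuntimeError(
                "model was pickled with scikit-learn "
                f"{artefact['sklearn']}, but {sklearn.__version__} is installed"
            )
        return artefact["model"]
    ```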

    Hope this helps.
