Spark ML - Save OneVsRestModel

庸人自扰 2020-12-10 08:08

I am in the middle of refactoring my code to take advantage of DataFrames, Estimators, and Pipelines. I was originally using MLlib Multiclass LogisticRegressionWithLBFGS on

1 answer
  • 2020-12-10 08:48

    Spark >= 2.0.0

    OneVsRestModel implements MLWritable, so it can be saved directly. The method shown below can still be useful for saving the individual models separately.
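
    As a minimal sketch of saving the model directly on Spark 2.x, the snippet below fits a tiny OneVsRest model, writes it out, and reads it back. The object name, the toy data, and the path /tmp/ovr-model are assumptions made up for illustration; they are not from the question.

    ```scala
    import org.apache.spark.ml.classification.{LogisticRegression, OneVsRest, OneVsRestModel}
    import org.apache.spark.ml.linalg.Vectors
    import org.apache.spark.sql.SparkSession

    object SaveOneVsRest {
      def main(args: Array[String]): Unit = {
        // Local session for illustration only
        val spark = SparkSession.builder()
          .master("local[*]")
          .appName("ovr-save")
          .getOrCreate()
        import spark.implicits._

        // Toy three-class training set (hypothetical data)
        val training = Seq(
          (0.0, Vectors.dense(0.0, 1.0)),
          (1.0, Vectors.dense(1.0, 0.0)),
          (2.0, Vectors.dense(1.0, 1.0))
        ).toDF("label", "features")

        val ovr = new OneVsRest()
          .setClassifier(new LogisticRegression().setMaxIter(10))
        val ovrModel = ovr.fit(training)

        // Hypothetical output path; overwrite() replaces an existing directory
        val path = "/tmp/ovr-model"
        ovrModel.write.overwrite().save(path)

        // Read the whole model back, including its per-class sub-models
        val restored = OneVsRestModel.load(path)
        assert(restored.models.length == ovrModel.models.length)

        spark.stop()
      }
    }
    ```

    Using write.overwrite() instead of save(path) alone avoids a failure when the target directory already exists, which is convenient when re-running the job.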

    Spark < 2.0.0

    The problem here is that models returns an Array of ClassificationModel[_, _], not an Array of LogisticRegressionModel (or MLWritable). To make it work, you have to be specific about the types:

    import org.apache.spark.ml.classification.LogisticRegressionModel
    
    ovrModel.models.zipWithIndex.foreach { 
      case (model: LogisticRegressionModel, i: Int) => 
        model.save(s"model-${model.uid}-$i")
    }
    

    or to be more generic:

    import org.apache.spark.ml.util.MLWritable
    
    ovrModel.models.zipWithIndex.foreach { 
      case (model: MLWritable, i: Int) =>
        model.save(s"model-${model.uid}-$i")
    }
    

    Unfortunately, as of Spark 1.6, OneVsRestModel doesn't implement MLWritable, so it cannot be saved on its own.

    Note:

    All models in the OneVsRest seem to use the same uid, hence we need an explicit index. The index will also be useful for identifying each model later.
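
    To illustrate why the index matters, here is a rough sketch of reading the individually saved models back in class order. It assumes the files were written with the model-<uid>-<index> naming used above, that every sub-model is a LogisticRegressionModel, and that a SparkContext is already active. The object name, loadAll, uid, and numModels are hypothetical names introduced for this example.

    ```scala
    import org.apache.spark.ml.classification.LogisticRegressionModel

    object LoadSavedModels {
      // uid: the shared uid used in the saved paths (hypothetical input)
      // numModels: how many per-class models were saved (hypothetical input)
      def loadAll(uid: String, numModels: Int): Array[LogisticRegressionModel] =
        (0 until numModels).map { i =>
          // Index i matches the position in ovrModel.models at save time
          LogisticRegressionModel.load(s"model-$uid-$i")
        }.toArray
    }
    ```

    The returned array keeps the same ordering as ovrModel.models had when the models were saved, so position i should correspond to the classifier for class i.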
