mleap

Export spark feature transformation pipeline to a file

一世执手 submitted on 2021-02-09 07:30:30
Question: PMML, MLeap, and PFA currently support only row-based transformations. None of them supports frame-based transformations such as aggregates, group-bys, or joins. What is the recommended way to export a Spark pipeline that includes these operations?
Answer 1: I see two options with respect to MLeap: 1) Implement DataFrame-based transformers and an MLeap equivalent of SQLTransformer. This solution seems conceptually the best (since you can always encapsulate such transformations in a pipeline element) but also a lot of
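The distinction the question draws between row-based and frame-based transformations can be sketched in plain Scala collections. This is only an illustration under stated assumptions: `Purchase`, `logAmount`, `totalPerUser`, and `withUserTotal` are hypothetical names that do not appear in the original post, and nothing here uses the Spark or MLeap APIs.

```scala
// Hypothetical record type, used only for illustration.
final case class Purchase(userId: String, amount: Double)

object RowVsFrame {
  // Row-based: the output depends on one input row and nothing else,
  // so it can be applied to a single record at serving time.
  def logAmount(p: Purchase): Double = math.log1p(p.amount)

  // Frame-based: the result for a key depends on every row that shares
  // that key, so the whole data set (or group) must be available.
  def totalPerUser(rows: Seq[Purchase]): Map[String, Double] =
    rows.groupBy(_.userId).map { case (id, ps) => id -> ps.map(_.amount).sum }

  // A join of the aggregate back onto the rows; again this needs the
  // full frame and cannot be computed from one row in isolation.
  def withUserTotal(rows: Seq[Purchase]): Seq[(Purchase, Double)] = {
    val totals = totalPerUser(rows)
    rows.map(p => (p, totals(p.userId)))
  }
}
```

A function like `logAmount` maps naturally onto a row-level export format, whereas `totalPerUser` and `withUserTotal` need state beyond the row being scored, which is the gap the question describes.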

Export spark feature transformation pipeline to a file

心已入冬 submitted on 2021-02-09 07:18:38
Question: PMML, MLeap, and PFA currently support only row-based transformations. None of them supports frame-based transformations such as aggregates, group-bys, or joins. What is the recommended way to export a Spark pipeline that includes these operations?
Answer 1: I see two options with respect to MLeap: 1) Implement DataFrame-based transformers and an MLeap equivalent of SQLTransformer. This solution seems conceptually the best (since you can always encapsulate such transformations in a pipeline element) but also a lot of

Export spark feature transformation pipeline to a file

江枫思渺然 submitted on 2021-02-09 07:14:43
Question: PMML, MLeap, and PFA currently support only row-based transformations. None of them supports frame-based transformations such as aggregates, group-bys, or joins. What is the recommended way to export a Spark pipeline that includes these operations?
Answer 1: I see two options with respect to MLeap: 1) Implement DataFrame-based transformers and an MLeap equivalent of SQLTransformer. This solution seems conceptually the best (since you can always encapsulate such transformations in a pipeline element) but also a lot of

Unable to serialize logistic regression in MLeap

拈花ヽ惹草 submitted on 2019-12-12 04:49:19
Question: java.lang.AssertionError: assertion failed: This op only supports binary logistic regression
I am trying to serialize a Spark pipeline with MLeap. My pipeline uses Tokenizer, HashingTF, and LogisticRegression. When I try to serialize the pipeline, I get the error above. Here is the code I am using to serialize the pipeline:
val pipeline = Pipeline(pipelineConfig)
val model = pipeline.fit(data)
(for(bf <- managed(BundleFile("jar:file:/tmp/abc.model.twitter.zip"))) yield { model