pipeline

Using multiple custom classes with Pipeline sklearn (Python)

元气小坏坏 submitted on 2021-02-18 11:43:27
Question: I am trying to write a tutorial on Pipeline for students, but I am stuck. I'm not an expert, but I'm trying to improve, so thank you for your patience. I want a pipeline to run several steps that prepare a dataframe for a classifier:
Step 1: Description of the dataframe
Step 2: Fill NaN values
Step 3: Transform categorical values into numbers
Here is my code:
    class Descr_df(object):
        def transform(self, X):
            print("Structure of the data: \n {}".format(X.head(5)))
            print("Features names:
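
The excerpt above is cut off before the error is shown, so the following is only a minimal sketch of how the three steps could be written as pipeline-compatible transformers. It assumes that each step implements both `fit` and `transform`, and that `transform` returns the data so the next step receives it. The class names, the fill strategy (median for numeric columns, most frequent value otherwise) and the sample dataframe are illustrative assumptions, not the asker's code.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline


class DescribeFrame(BaseEstimator, TransformerMixin):
    """Step 1: print a short description and pass the data through unchanged."""

    def fit(self, X, y=None):
        return self  # nothing to learn

    def transform(self, X):
        print("Structure of the data:\n{}".format(X.head(5)))
        print("Feature names: {}".format(list(X.columns)))
        return X  # return the data so the next step receives it


class FillNaN(BaseEstimator, TransformerMixin):
    """Step 2: fill missing values (assumed strategy: median or most frequent)."""

    def fit(self, X, y=None):
        self.fill_values_ = {}
        for col in X.columns:
            if pd.api.types.is_numeric_dtype(X[col]):
                self.fill_values_[col] = X[col].median()
            else:
                self.fill_values_[col] = X[col].mode().iloc[0]
        return self

    def transform(self, X):
        return X.fillna(self.fill_values_)


class EncodeCategories(BaseEstimator, TransformerMixin):
    """Step 3: replace each non-numeric column with integer codes."""

    def fit(self, X, y=None):
        self.categories_ = {
            col: sorted(X[col].dropna().unique())
            for col in X.columns
            if not pd.api.types.is_numeric_dtype(X[col])
        }
        return self

    def transform(self, X):
        X = X.copy()
        for col, cats in self.categories_.items():
            X[col] = pd.Categorical(X[col], categories=cats).codes
        return X


# Hypothetical sample data with one numeric and one categorical column.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "city": ["Paris", "Lyon", None, "Paris"],
})

prep = Pipeline([
    ("describe", DescribeFrame()),
    ("fill", FillNaN()),
    ("encode", EncodeCategories()),
])
result = prep.fit_transform(df)
print(result)
```

Inheriting from `TransformerMixin` supplies `fit_transform`, and a classifier can be appended as the final step of the same `Pipeline`.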

Can I set a timeout and number of retries on a specific pipeline request?

为君一笑 submitted on 2021-02-18 10:34:41
Question: When using spray's pipelining to make an HTTP request like this:
    val urlpipeline = sendReceive ~> unmarshal[String]
    urlpipeline { Get(url) }
is there a way to specify a timeout for the request and the number of times it should retry for that specific request? All the documentation I've found only covers doing this in a config file (and even then I can't seem to get it to work). Thanks.
Answer 1: With the configuration file
I use Spray 1.2.0 in an Akka system. Inside my actor, I import the existing Akka

Kubeflow pipeline doesn't create any pod; unknown status

懵懂的女人 submitted on 2021-02-11 14:40:56
Question: I started working with Kubeflow and created a first, small pipeline. Unfortunately it doesn't work: when I try to create a run with my pipeline, nothing happens. It neither creates a Kubernetes pod nor changes the status of the run (it keeps saying "Unknown status"). I also can't see the corresponding graph or run output. The code of my pipeline looks like this:
    import kfp
    from kfp import components
    from kfp import dsl
    from kfp import onprem
    import sys

    def train_op( epochs, validations,

AttributeError: 'numpy.ndarray' object has no attribute 'id'

大憨熊 submitted on 2021-02-11 13:42:08
Question: I am creating a sklearn pipeline that consists of 3 steps:
1. Transforms a pandas dataframe into a 3D array
2. Transforms the 3D array into a recurrence plot (image)
3. Trains an image classification model using Keras
This is my initial data set:
train_df - pandas dataframe
    id  cycle  s1
    1   1      0.05
    1   2      0.04
    1   3      0.05
    1   4      0.05
    2   1      0.02
    2   2      0.03
y_train
    array([[1., 0., 0.], [1., 0., 0.], ... [1., 0., 0.]], dtype=float32)
When I run my current code (see below), I get the following error: AttributeError: 'numpy.ndarray'
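
The asker's code is not visible in this excerpt, so the sketch below only illustrates the first step and one way such an error can arise. Once a dataframe has been converted to a NumPy array, column access such as `.id` is no longer available to later steps. The helper name `frame_to_3d` and the zero-padding of shorter series are assumptions made for this illustration.

```python
import numpy as np
import pandas as pd

# The sample rows shown in the question.
train_df = pd.DataFrame({
    "id":    [1, 1, 1, 1, 2, 2],
    "cycle": [1, 2, 3, 4, 1, 2],
    "s1":    [0.05, 0.04, 0.05, 0.05, 0.02, 0.03],
})


def frame_to_3d(df, feature_cols=("s1",)):
    """Build an array of shape (n_ids, max_cycles, n_features).

    Series shorter than the longest one are zero-padded (an assumption).
    """
    groups = [
        g.sort_values("cycle")[list(feature_cols)].to_numpy()
        for _, g in df.groupby("id")
    ]
    max_len = max(len(g) for g in groups)
    out = np.zeros((len(groups), max_len, len(feature_cols)))
    for i, g in enumerate(groups):
        out[i, :len(g), :] = g
    return out


arr = frame_to_3d(train_df)
print(arr.shape)  # (2, 4, 1)

# The dataframe exposes its columns as attributes; the array does not,
# so any later step that calls X.id on the array raises AttributeError.
print(hasattr(train_df, "id"), hasattr(arr, "id"))
```

If a later step still needs the `id` values, they have to be carried along explicitly, because the array returned by the first step no longer contains column labels.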

GridSearchCV on a working pipeline returns ValueError

ぐ巨炮叔叔 submitted on 2021-02-10 15:16:04
Question: I am using GridSearchCV to find the best parameters for my pipeline. My pipeline seems to work well, as I can run:
    pipeline.fit(X_train, y_train)
    preds = pipeline.predict(X_test)
and I get a decent result. But GridSearchCV obviously doesn't like something, and I cannot figure out what. My pipeline:
    feats = FeatureUnion([('age', age),
                          ('education_num', education_num),
                          ('is_education_favo', is_education_favo),
                          ('is_marital_status_favo', is_marital_status_favo),
                          ('hours_per_week', hours
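
The excerpt stops before the parameter grid and the error message, so the cause of the ValueError cannot be determined from it. As a general sanity check, the sketch below shows how parameter names for a `Pipeline` containing a `FeatureUnion` can be listed with `get_params()` and compared against the grid before calling `GridSearchCV`. The step names, transformers and data here are placeholders, not the asker's.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import FeatureUnion, Pipeline

# Placeholder data and steps standing in for the asker's feature pipelines.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

feats = FeatureUnion([
    ("pca", PCA(n_components=2)),
    ("kbest", SelectKBest(k=3)),
])
pipeline = Pipeline([
    ("features", feats),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Nested parameters are addressed as <step>__<substep>__<parameter>.
valid_names = set(pipeline.get_params().keys())

param_grid = {
    "features__pca__n_components": [2, 3],
    "features__kbest__k": [2, 3],
    "clf__C": [0.1, 1.0],
}

# Fail early if a grid key does not match any parameter of the pipeline.
unknown = [name for name in param_grid if name not in valid_names]
assert not unknown, "Unknown parameter names: {}".format(unknown)

grid = GridSearchCV(pipeline, param_grid, cv=3)
grid.fit(X, y)
print(grid.best_params_)
```

Checking the keys first separates naming mistakes in the grid from errors raised by the estimators themselves during cross-validation.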

sklearn - How to retrieve PCA components and explained variance from inside a Pipeline passed to GridSearchCV

我们两清 submitted on 2021-02-08 06:51:35
Question: I am using GridSearchCV with a pipeline as follows:
    grid = GridSearchCV(
        Pipeline([
            ('reduce_dim', PCA()),
            ('classify', RandomForestClassifier(n_jobs = -1))
        ]),
        param_grid=[
            {
                'reduce_dim__n_components': range(0.7,0.9,0.1),
                'classify__n_estimators': range(10,50,5),
                'classify__max_features': ['auto', 0.2],
                'classify__min_samples_leaf': [40,50,60],
                'classify__criterion': ['gini', 'entropy']
            }
        ],
        cv=5, scoring='f1')
    grid.fit(X,y)
How do I now retrieve PCA details like components and explained
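
One common way to reach the fitted PCA step is through `best_estimator_`, which is the pipeline refitted with the best parameters (available when `refit` is left at its default). The sketch below assumes synthetic data from `make_classification` and a smaller grid than the one in the question. Note also that Python's built-in `range` accepts only integers, so `range(0.7,0.9,0.1)` raises a `TypeError`; an explicit list of floats is used here instead.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic binary classification data (an assumption for this sketch).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=6, random_state=0
)

grid = GridSearchCV(
    Pipeline([
        ("reduce_dim", PCA()),
        ("classify", RandomForestClassifier(random_state=0)),
    ]),
    param_grid={
        # A float in (0, 1) asks PCA to keep enough components to
        # explain at least that fraction of the variance.
        "reduce_dim__n_components": [0.7, 0.8, 0.9],
        "classify__n_estimators": [10, 20],
    },
    cv=3,
    scoring="f1",
)
grid.fit(X, y)

# The refitted best pipeline exposes its steps by name.
pca = grid.best_estimator_.named_steps["reduce_dim"]

print(grid.best_params_)
print(pca.components_.shape)          # (n_components_, n_features)
print(pca.explained_variance_ratio_)  # variance fraction per component
print(pca.explained_variance_ratio_.sum())
```

The same `named_steps` lookup works for the classifier step, for example to read `feature_importances_` from the fitted random forest.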