“Parallel” pipeline to get the best model using grid search
In sklearn, a serial pipeline can be defined and grid-searched to find the best combination of hyperparameters for all consecutive parts of the pipeline. A serial pipeline can be implemented as follows:

    from sklearn.svm import SVC
    from sklearn import decomposition, datasets
    from sklearn.pipeline import Pipeline
    from sklearn.model_selection import GridSearchCV

    digits = datasets.load_digits()
    X_train = digits.data
    y_train = digits.target

    # Use Principal Component Analysis to reduce dimensionality
    # and improve generalization
    pca = decomposition.PCA()

    # Use an SVC (note: SVC() defaults to an RBF kernel, not a linear one)
    svm = SVC()

    # Combine PCA and SVC to a
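
For reference, here is a minimal sketch of how a serial PCA + SVC pipeline can be grid-searched end to end. The step names `pca` and `svm`, the parameter grid values, and `cv=3` are assumptions chosen for illustration and do not come from the original post; the grid is kept deliberately small so the search runs quickly.

```python
from sklearn import datasets, decomposition
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

digits = datasets.load_digits()
X_train, y_train = digits.data, digits.target

# Step names "pca" and "svm" are assumptions; they determine the
# "<step>__<parameter>" keys used in the grid below.
pipe = Pipeline(steps=[("pca", decomposition.PCA()), ("svm", SVC())])

# Hypothetical grid: every combination of these values is evaluated.
param_grid = {
    "pca__n_components": [20, 40],
    "svm__C": [1, 10],
    "svm__kernel": ["linear", "rbf"],
}

# 3-fold cross-validation over all 2 * 2 * 2 = 8 combinations.
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X_train, y_train)

print(search.best_params_)
print(search.best_score_)
```

Because the grid keys address each step by name, one `GridSearchCV` call tunes the PCA and SVC hyperparameters jointly rather than one step at a time.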