scikit-learn

Cannot understand sklearn's PolynomialFeatures

风流意气都作罢 submitted on 2020-12-27 18:49:22

Question: I need help with sklearn's PolynomialFeatures. It works well with one feature, but whenever I add multiple features, the output array also contains values other than the features raised to the powers of the degree. For example, for this array

X = np.array([[230.1, 37.8, 69.2]])

when I try

X_poly = poly.fit_transform(X)

it outputs

[[ 1.00000000e+00  2.30100000e+02  3.78000000e+01  6.92000000e+01
   5.29460100e+04  8.69778000e+03  1.59229200e+04  1.42884000e+03
   2.61576000e+03  4.78864000e+03]]

Here, what is 8

How to compute precision, recall and F1 score of an imbalanced dataset for k-fold cross-validation with 10 folds in Python

拟墨画扇 submitted on 2020-12-27 10:09:34

Question: I have an imbalanced dataset for a binary classification problem. I built a Random Forest classifier and used k-fold cross-validation with 10 folds.

kfold = model_selection.KFold(n_splits=10, random_state=42)
model = RandomForestClassifier(n_estimators=50)

I got the results of the 10 folds:

results = model_selection.cross_val_score(model, features, labels, cv=kfold)
print results

[ 0.60666667  0.60333333  0.52333333  0.73        0.75333333  0.72
  0.7         0.73        0.83666667  0.88666667]

I have calculated
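cross_val_score returns only one metric (accuracy by default). To get precision, recall, and F1 for every fold in a single run, cross_validate accepts several scorers at once. A sketch on synthetic imbalanced data (make_classification stands in here for the real dataset; note that recent scikit-learn versions require shuffle=True when KFold is given a random_state):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_validate

# Synthetic imbalanced binary data, standing in for the real dataset.
features, labels = make_classification(n_samples=500, weights=[0.9, 0.1],
                                       random_state=42)

# shuffle=True is mandatory alongside random_state in recent versions.
kfold = KFold(n_splits=10, shuffle=True, random_state=42)
model = RandomForestClassifier(n_estimators=50, random_state=42)

# cross_validate evaluates several scorers in one pass; each metric comes
# back as a "test_<name>" array with one entry per fold.
scores = cross_validate(model, features, labels, cv=kfold,
                        scoring=["precision", "recall", "f1"])
print(scores["test_precision"].mean())
print(scores["test_recall"].mean())
print(scores["test_f1"].mean())
```

For imbalanced data, StratifiedKFold is usually the safer choice, since it preserves the class ratio in every fold.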

GridSearch over MultiOutputRegressor?

 ̄綄美尐妖づ submitted on 2020-12-27 08:54:59

Question: Consider a multivariate regression problem (2 response variables: Latitude and Longitude). Some machine-learning model implementations, such as Support Vector Regression (sklearn.svm.SVR), do not currently provide native support for multivariate regression. For such cases, sklearn.multioutput.MultiOutputRegressor can be used. Example:

from sklearn.multioutput import MultiOutputRegressor
svr_multi = MultiOutputRegressor(SVR(), n_jobs=-1)

# Fit the algorithm on the data
svr_multi.fit(X
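GridSearchCV works on the wrapper directly; the trick is that the inner SVR's hyperparameters are addressed with the estimator__ prefix (the name MultiOutputRegressor gives its wrapped estimator). A minimal sketch on random stand-in data:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X = rng.rand(50, 3)
y = rng.rand(50, 2)          # two targets, e.g. Latitude and Longitude

# Parameters of the wrapped SVR are reached via the "estimator__" prefix.
param_grid = {"estimator__C": [0.1, 1, 10],
              "estimator__epsilon": [0.05, 0.1]}
search = GridSearchCV(MultiOutputRegressor(SVR()), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

The same prefixing convention applies to any scikit-learn meta-estimator (Pipeline steps use their step name instead of "estimator"). Note this grid fits one shared hyperparameter setting for both outputs; tuning each output separately would require searching per target.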

When scaling the data, why does the train dataset use 'fit' and 'transform', but the test dataset only 'transform'?

元气小坏坏 submitted on 2020-12-27 08:48:21

Question: When scaling the data, why does the train dataset use 'fit' and 'transform', but the test dataset only 'transform'?

SAMPLE_COUNT = 5000
TEST_COUNT = 20000
seed(0)
sample = list()
test_sample = list()
for index, line in enumerate(open('covtype.data', 'rb')):
    if index < SAMPLE_COUNT:
        sample.append(line)
    else:
        r = randint(0, index)
        if r < SAMPLE_COUNT:
            sample[r] = line
        else:
            k = randint(0, index)
            if k < TEST_COUNT:
                if len(test_sample) < TEST_COUNT:
                    test_sample.append(line)
                else:
                    test_sample[k] = line
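The short answer is data leakage: fit learns the scaling statistics (for StandardScaler, the per-feature mean and standard deviation), and those must come from the training set only. The test set is then transformed with the same train-set statistics, which simulates how truly unseen data would be handled. A minimal sketch:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

train = np.array([[1.0], [2.0], [3.0]])
test = np.array([[2.0], [4.0]])

scaler = StandardScaler()
scaler.fit(train)                      # mean and std come from TRAIN only
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)   # reuse the train statistics: no leakage

# The train mean (2.0) maps to 0 wherever it appears, even in the test set.
print(scaler.mean_)        # [2.]
print(test_scaled[0, 0])   # 0.0
```

Calling fit (or fit_transform) on the test set would instead compute fresh statistics from data the model is supposed never to have seen, making evaluation optimistically biased.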

Visualise word2vec generated from gensim

≯℡__Kan透↙ submitted on 2020-12-27 08:20:30

Question: I have trained a doc2vec model and a corresponding word2vec model on my own corpus using gensim. I want to visualise the word2vec embeddings with t-SNE, showing the words, so that each dot in the figure is labelled with its word. I looked at a similar question here: t-sne on word2vec. Following it, I have this code:

import gensim
import gensim.models as g
from sklearn.manifold import TSNE
import re
import matplotlib.pyplot as plt

modelPath="/Users/tarun/Desktop/PE/doc2vec/model3_100_newCorpus60_1min_6window
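A sketch of the labelling step: the key is matplotlib's annotate, called once per word on the 2-D t-SNE coordinates. Random vectors stand in here for the trained embeddings; with a real model the vectors would come from the model's word-vector store (e.g. model.wv in recent gensim versions, indexed by each vocabulary word):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")            # headless backend, safe without a display
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Stand-in embeddings: 30 "words" with 100-dimensional random vectors.
words = ["word%d" % i for i in range(30)]
vectors = np.random.RandomState(0).rand(30, 100)

# perplexity must be smaller than the number of points.
coords = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(vectors)

fig, ax = plt.subplots()
ax.scatter(coords[:, 0], coords[:, 1])
for word, (x, y) in zip(words, coords):
    ax.annotate(word, (x, y))    # place each word next to its dot
fig.savefig("tsne_words.png")
```

For a large vocabulary, annotating every point becomes unreadable; a common compromise is to label only the top-N most frequent words.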

Python scikit learn MLPClassifier “hidden_layer_sizes”

你说的曾经没有我的故事 submitted on 2020-12-27 08:07:33

Question: I am lost in the scikit-learn 0.18 user manual (http://scikit-learn.org/dev/modules/generated/sklearn.neural_network.MLPClassifier.html#sklearn.neural_network.MLPClassifier):

hidden_layer_sizes : tuple, length = n_layers - 2, default (100,)
    The ith element represents the number of neurons in the ith hidden layer.

If I want only 1 hidden layer with 7 hidden units in my model, should I put it like this? Thanks!

hidden_layer_sizes=(7, 1)

Answer 1: hidden_layer_sizes=(7,) if you want only 1
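To confirm the answer: hidden_layer_sizes takes one entry per hidden layer, so a single hidden layer of 7 units is the 1-tuple (7,), whereas (7, 1) would build two hidden layers with the second reduced to a single neuron. A quick check on the iris data:

```python
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)

# One hidden layer with 7 units is the 1-tuple (7,).
# (7, 1) would instead create TWO hidden layers, the second with one neuron.
clf = MLPClassifier(hidden_layer_sizes=(7,), max_iter=2000, random_state=0)
clf.fit(X, y)

print(clf.coefs_[0].shape)   # (4, 7): 4 iris features feeding 7 hidden units
print(clf.n_layers_)         # 3: input layer + 1 hidden layer + output layer
```

This also explains the "length = n_layers - 2" wording in the docstring: n_layers_ counts the input and output layers as well, so one hidden layer gives n_layers_ = 3.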
