scikit-learn

Receiving KeyError: "None of [Int64Index([ … dtype='int64', length=1323)] are in the [columns]"

≯℡__Kan透↙ submitted on 2020-12-29 05:55:31
Question SUMMARY: When feeding test and train data into a ROC curve plot, I receive the following error: KeyError: "None of [Int64Index([ 0, 1, 2, ... dtype='int64', length=1323)] are in the [columns]". The error seems to say that it doesn't like the format of my data, but the code worked the first time I ran it and I haven't been able to get it to run again. Am I splitting my data incorrectly, or passing incorrectly formatted data into my function? WHAT I'VE TRIED: Read through several StackOverflow
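A common cause of this exact KeyError is indexing a pandas DataFrame with an array of integer positions (for example, indices produced by a cross-validation split) using `df[...]`, which looks labels up in the *columns*, instead of `df.iloc[...]`. A minimal sketch, using a hypothetical DataFrame since the asker's data isn't shown:

```python
import pandas as pd
from sklearn.model_selection import KFold

# Hypothetical stand-in for the asker's feature DataFrame.
df = pd.DataFrame({"f1": range(10), "f2": range(10, 20)})

kf = KFold(n_splits=2)
for train_idx, test_idx in kf.split(df):
    # df[train_idx] would raise:
    #   KeyError: "None of [Int64Index(...)] are in the [columns]"
    # because df[<integer array>] is interpreted as a COLUMN lookup.
    train = df.iloc[train_idx]  # positional row selection works
    test = df.iloc[test_idx]
```

This also explains why such code can "work once and then fail": it runs if the variable happens to be a NumPy array (where integer indexing selects rows) and breaks as soon as it is a DataFrame.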

Working of labelEncoder in sklearn

不羁岁月 submitted on 2020-12-29 04:03:32
Question: Say I have the following input feature: hotel_id = [1, 2, 3, 2, 3]. This is a categorical feature with numeric values. If I give it to the model as-is, the model will treat it as a continuous variable, i.e., 2 > 1. If I apply sklearn's LabelEncoder, I get: hotel_id = [0, 1, 2, 1, 2]. So is this encoded feature considered continuous or categorical? If it is treated as continuous, then what's the use of LabelEncoder? P.S. I know about one-hot encoding, but there are around 100 hotel
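The encoded values are still just integers, so a model will treat them as continuous; `LabelEncoder` is intended for encoding *target* labels, not input features. For a nominal feature like `hotel_id`, one-hot encoding is the usual fix. A minimal sketch contrasting the two:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

hotel_id = [1, 2, 3, 2, 3]

# LabelEncoder maps classes to 0..n-1 but keeps a single ordered column,
# so downstream models still see an implicit 0 < 1 < 2 ordering.
encoded = LabelEncoder().fit_transform(hotel_id)  # [0, 1, 2, 1, 2]

# OneHotEncoder expands the feature into one binary column per category,
# removing the spurious ordering. (.toarray() densifies the sparse output.)
onehot = OneHotEncoder().fit_transform(
    np.array(hotel_id).reshape(-1, 1)
).toarray()
```

With ~100 hotel ids, one-hot encoding adds ~100 sparse columns, which tree-based or linear models generally handle fine; sklearn keeps the result sparse by default precisely for this case.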

Apply StandardScaler to parts of a data set

坚强是说给别人听的谎言 submitted on 2020-12-28 06:54:06
Question: I want to use sklearn's StandardScaler. Is it possible to apply it to some feature columns but not others? For instance, say my data is: data = pd.DataFrame({'Name': [3, 4, 6], 'Age': [18, 92, 98], 'Weight': [68, 59, 49]}), which prints as: Age Name Weight / 0 18 3 68 / 1 92 4 59 / 2 98 6 49. Then col_names = ['Name', 'Age', 'Weight']; features = data[col_names]. I fit and transform the data: scaler = StandardScaler().fit(features.values); features = scaler.transform(features.values); scaled_features = pd.DataFrame(features,
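Yes: either fit the scaler on just the columns you want and assign the result back, or use `ColumnTransformer` with `remainder='passthrough'` so the untouched columns are carried through. A sketch of both options, using the question's own data:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

data = pd.DataFrame({'Name': [3, 4, 6],
                     'Age': [18, 92, 98],
                     'Weight': [68, 59, 49]})

# Option 1: scale only selected columns, assign back in place.
scaled = data.copy()
scaled[['Age', 'Weight']] = StandardScaler().fit_transform(
    scaled[['Age', 'Weight']])

# Option 2: ColumnTransformer scales 'Age' and 'Weight' and passes
# 'Name' through unchanged (handy inside a Pipeline).
ct = ColumnTransformer(
    [('scale', StandardScaler(), ['Age', 'Weight'])],
    remainder='passthrough')
out = ct.fit_transform(data)
```

Note that `ColumnTransformer` reorders the output (transformed columns first, passthrough columns after), so `out`'s columns are Age, Weight, Name.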

Cannot understand sklearn's PolynomialFeatures

回眸只為那壹抹淺笑 submitted on 2020-12-27 19:09:11
Question: I need help with sklearn's PolynomialFeatures. It works fine with one feature, but whenever I add multiple features, the output array also contains values besides the features raised to the powers of the degree. For example, for the array X = np.array([[230.1, 37.8, 69.2]]), when I run X_poly = poly.fit_transform(X) it outputs [[ 1.00000000e+00 2.30100000e+02 3.78000000e+01 6.92000000e+01 5.29460100e+04 8.69778000e+03 1.59229200e+04 1.42884000e+03 2.61576000e+03 4.78864000e+03]]. Here, what is 8
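The extra values are the *interaction terms*: with `degree=2` and features [a, b, c], PolynomialFeatures produces [1, a, b, c, a², ab, ac, b², bc, c²], not just the pure powers. So 8.69778e3 in the question's output is a·b = 230.1 × 37.8. A minimal sketch reproducing this:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[230.1, 37.8, 69.2]])
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

# Output column order for features [a, b, c] at degree 2:
#   [1, a, b, c, a^2, a*b, a*c, b^2, b*c, c^2]
# so column 5 is the cross term a*b = 230.1 * 37.8 = 8697.78.
```

If only the interaction terms are wanted (no squares), `PolynomialFeatures(degree=2, interaction_only=True)` drops the pure power columns.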