Question
How can I pass categorical and numeric features to DecisionTreeRegressor in sklearn? The code below shows the general usage with numeric features only:
from sklearn import tree
make_tree = tree.DecisionTreeRegressor()
fit_tree = make_tree.fit(X_train, y_train)
Answer 1:
First, all categorical features must be encoded as numbers, because scikit-learn's regression models only accept numeric input. To do so, you can use LabelEncoder followed by OneHotEncoder (in scikit-learn 0.20 and later, OneHotEncoder accepts string columns directly, so the LabelEncoder step is not needed). For high-cardinality features, you can use FeatureHasher.
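The one-hot route can be sketched roughly as follows. This is a minimal illustration, not the answerer's own code: the column names `color` and `size` and the toy values are made up, and `ColumnTransformer` with `Pipeline` is just one convenient way to apply the encoder to the categorical column only while leaving the numeric column untouched.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data: one categorical column and one numeric column.
X_train = pd.DataFrame({
    "color": ["red", "blue", "green", "blue", "red", "green"],
    "size": [1.0, 2.5, 3.0, 2.0, 1.5, 3.5],
})
y_train = [10.0, 20.0, 30.0, 21.0, 11.0, 31.0]

# One-hot encode the categorical column; pass the numeric column through unchanged.
preprocess = ColumnTransformer(
    transformers=[("cat", OneHotEncoder(handle_unknown="ignore"), ["color"])],
    remainder="passthrough",
)

# Chain preprocessing and the regressor so both are fitted together.
model = Pipeline([
    ("preprocess", preprocess),
    ("tree", DecisionTreeRegressor(random_state=0)),
])
model.fit(X_train, y_train)

predictions = model.predict(X_train)
```

Keeping the encoder inside the pipeline means the same encoding is reapplied automatically at prediction time, and `handle_unknown="ignore"` keeps unseen categories from raising an error.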
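For example, to hash a high-cardinality column with FeatureHasher:
from sklearn.feature_extraction import FeatureHasher
# n_features: number of output columns (size of the hash space), not the number of unique values
# input_type='string': each sample must be an iterable of strings, so wrap every value in a list
feature_hasher = FeatureHasher(n_features=5000, input_type='string')
# The result is a sparse matrix with n_features columns; it cannot be stored in a single DataFrame column
hashed = feature_hasher.transform([[value] for value in df['COLUMN_NAME'].astype(str)])
Then you can pass your features to the regressor. Because the hashed output is a sparse matrix, stack it with the numeric columns (for example with scipy.sparse.hstack) instead of assigning it back to the DataFrame.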
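To make the last step concrete, here is a hedged sketch of hashing one categorical column, stacking it next to a numeric column, and fitting the tree. The column names `city` and `area`, the toy values, and the deliberately small `n_features=16` are assumptions for illustration only; a real high-cardinality column would normally use a much larger hash space.

```python
import pandas as pd
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction import FeatureHasher
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data; 'city' stands in for a high-cardinality column.
df = pd.DataFrame({
    "city": ["paris", "tokyo", "lima", "tokyo", "paris", "oslo"],
    "area": [50.0, 30.0, 70.0, 35.0, 55.0, 60.0],
})
y_train = [500.0, 600.0, 300.0, 650.0, 520.0, 700.0]

# Hash each category value as a single token (one-element list per row).
hasher = FeatureHasher(n_features=16, input_type="string")
hashed = hasher.transform([[value] for value in df["city"].astype(str)])

# Put the numeric column and the hashed columns side by side in one sparse matrix.
numeric = csr_matrix(df[["area"]].to_numpy())
X_train = hstack([numeric, hashed]).tocsr()

# DecisionTreeRegressor accepts sparse input directly.
make_tree = DecisionTreeRegressor(random_state=0)
fit_tree = make_tree.fit(X_train, y_train)
predictions = fit_tree.predict(X_train)
```

Hashing avoids storing a category-to-column mapping, at the cost of possible collisions when `n_features` is small relative to the number of distinct values.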
Source: https://stackoverflow.com/questions/50191729/how-to-pass-mixed-categorical-and-numeric-features-to-decision-tree-regressor