Question
Before building a model I scale the data like this:

from sklearn.preprocessing import StandardScaler

X = StandardScaler(with_mean=False, with_std=True).fit_transform(X)

and afterwards build a feature importance plot:

import xgboost as xgb
import matplotlib.pyplot as plt

xgb.plot_importance(bst, color='red')
plt.title('importance', fontsize=20)
plt.yticks(fontsize=10)
plt.ylabel('features', fontsize=20)

The problem is that instead of the feature names the plot shows f0, f1, f2, f3, and so on. How can I get the feature names back?
Thanks.
Answer 1:
Scaling with StandardScaler returns a plain numpy array, so the booster never sees the original column names. First, get the list of feature names before preprocessing, e.g. from a DMatrix built on the original DataFrame:

dtrain = xgb.DMatrix(X, label=y)
dtrain.feature_names

Then map the generic f0, f1, ... keys returned by bst.get_fscore() back to those names, and plot the resulting dict:

mapper = {'f{0}'.format(i): v for i, v in enumerate(dtrain.feature_names)}
mapped = {mapper[k]: v for k, v in bst.get_fscore().items()}

xgb.plot_importance(mapped, color='red')

That's all.
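For reference, here is a self-contained sketch of this approach; the dataset and the column names (age, income, tenure) are made up for illustration, and the real names are taken from the pre-scaling DataFrame's columns:

import numpy as np
import pandas as pd
import xgboost as xgb
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 3)), columns=['age', 'income', 'tenure'])
y = rng.integers(0, 2, size=200)

# Scaling returns a numpy array, so the booster only sees f0, f1, ...
X_scaled = StandardScaler().fit_transform(X)
dtrain = xgb.DMatrix(X_scaled, label=y)
bst = xgb.train({'objective': 'binary:logistic'}, dtrain, num_boost_round=10)

# Map the generic keys back to the original column names and plot the dict
mapper = {'f{0}'.format(i): name for i, name in enumerate(X.columns)}
mapped = {mapper[k]: v for k, v in bst.get_fscore().items()}
xgb.plot_importance(mapped, color='red')
plt.show()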
Answer 2:
You can retrieve the importances of an XGBoost model trained with the scikit-learn-like API (here xgb names the fitted estimator, not the module) with:

xgb.feature_importances_

To check what type of importance it is, inspect xgb.importance_type. The importance type can be set in the XGBoost constructor. You can read about ways to compute feature importance in XGBoost in this post.
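A minimal sketch of this route, on made-up data and assuming a reasonably recent xgboost; the importance type is passed to the constructor and read back after fitting:

import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = rng.integers(0, 2, size=200)

# Choose the importance type up front, then inspect it on the fitted model
model = XGBClassifier(importance_type='gain')
model.fit(X, y)
print(model.importance_type)       # 'gain'
print(model.feature_importances_)  # one score per input column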
Answer 3:
For xgboost 0.82 the answer is quite simple: just overwrite the model's feature names attribute with the list of feature name strings.

trained_xgbmodel.feature_names = feature_name_list
xgboost.plot_importance(trained_xgbmodel)
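A short sketch of this, with hypothetical names; the answer targets xgboost 0.82, and newer releases may validate the assigned names against the number of training features:

import numpy as np
import xgboost as xgb
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
dtrain = xgb.DMatrix(rng.normal(size=(200, 3)), label=rng.integers(0, 2, size=200))
bst = xgb.train({'objective': 'binary:logistic'}, dtrain, num_boost_round=10)

# Overwrite the generic f0, f1, f2 names before plotting
bst.feature_names = ['age', 'income', 'tenure']
xgb.plot_importance(bst)
plt.show()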
Source: https://stackoverflow.com/questions/38600813/names-features-importance-plot-after-preprocessing