How can I install the XGBoost package in Python on Windows?

庸人自扰 · 2020-12-16 15:32

I tried to install the XGBoost package in Python. I am using Windows OS, 64-bit. I have gone through the following.

The package directory states that xgboost is unstable for Windows.

9 Answers
  •  谎友^ (OP)
     2020-12-16 15:35

    I installed xgboost on Windows by following the resources above (at the time it was not yet available via pip). I then used the following code to tune the CV parameters:

    # Import libraries:
    import pandas as pd
    import numpy as np
    import xgboost as xgb
    from xgboost.sklearn import XGBClassifier
    from sklearn import metrics                         # additional sklearn evaluation functions
    from sklearn.model_selection import GridSearchCV    # performing grid search

    import matplotlib.pylab as plt
    %matplotlib inline
    from matplotlib.pylab import rcParams
    rcParams['figure.figsize'] = 12, 4

    train = pd.read_csv('train_data.csv')
    target = 'target_value'
    IDcol = 'ID'
    

    A function is created to tune the number of boosting rounds via cross-validation, fit the model, and display the output in visual form.

    def modelfit(alg, dtrain, predictors, useTrainCV=True, cv_folds=5, early_stopping_rounds=50):

        if useTrainCV:
            # Use xgboost's built-in cross-validation to find the best number of boosting rounds
            xgb_param = alg.get_xgb_params()
            xgtrain = xgb.DMatrix(dtrain[predictors].values, label=dtrain[target].values)
            cvresult = xgb.cv(xgb_param, xgtrain, num_boost_round=alg.get_params()['n_estimators'], nfold=cv_folds,
                metrics='auc', early_stopping_rounds=early_stopping_rounds, verbose_eval=False)
            alg.set_params(n_estimators=cvresult.shape[0])

        # Fit the algorithm on the data
        alg.fit(dtrain[predictors], dtrain[target], eval_metric='auc')

        # Predict training set:
        dtrain_predictions = alg.predict(dtrain[predictors])
        dtrain_predprob = alg.predict_proba(dtrain[predictors])[:, 1]

        # Print model report:
        print("\nModel Report")
        print("Accuracy : %.4g" % metrics.accuracy_score(dtrain[target].values, dtrain_predictions))
        print("AUC Score (Train): %f" % metrics.roc_auc_score(dtrain[target], dtrain_predprob))

        # Plot feature importances
        feat_imp = pd.Series(alg.get_booster().get_fscore()).sort_values(ascending=False)
        feat_imp.plot(kind='bar', title='Feature Importances')
        plt.ylabel('Feature Importance Score')
    

    Now, when the function is called to get the optimum parameters:

    # Choose all predictors except the target & ID column
    predictors = [x for x in train.columns if x not in [target, IDcol]]
    xgb_model = XGBClassifier(   # renamed from `xgb` so the imported xgboost module is not shadowed
        learning_rate=0.1,
        n_estimators=1000,
        max_depth=5,
        min_child_weight=1,
        gamma=0,
        subsample=0.7,
        colsample_bytree=0.7,
        objective='binary:logistic',
        nthread=4,
        scale_pos_weight=1,
        seed=198)
    modelfit(xgb_model, train, predictors)
    
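    For reference, a quick way to confirm what the CV step settled on is to read the fitted estimator back after modelfit() returns. This is only a small sketch and assumes the xgb_model variable defined above:

    # Inspect the classifier after modelfit() has tuned it (assumes the call above succeeded)
    tuned = xgb_model.get_params()
    print("Tuned n_estimators:", tuned['n_estimators'])   # set from cvresult.shape[0] inside modelfit()
    print("Learning rate:", tuned['learning_rate'])
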

    Although the feature importance chart is displayed, the parameter info in the red box at the top of the chart is missing. I consulted people who use Linux/macOS and have xgboost installed, and they do get that info. I was wondering whether this is due to the specific build I compiled and installed on Windows, and how I can get the parameter info displayed above the chart. As of now, I am getting the chart but not the red box and the info within it. Thanks.
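
    As a possible workaround for the missing red box (a minimal sketch, not the original chart header), the tuned parameters can be drawn above the chart with plain matplotlib calls added at the end of modelfit(); the choice of which parameters to show here is arbitrary:

    # Hypothetical addition at the end of modelfit(), after the feature-importance plot:
    shown = ('learning_rate', 'n_estimators', 'max_depth', 'subsample')
    params_text = ', '.join('%s=%s' % (k, v) for k, v in alg.get_params().items() if k in shown)
    plt.suptitle(params_text, fontsize=8, color='red')   # renders the parameter summary above the chart title
    plt.show()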
