ValueError: feature_names mismatch: in xgboost in the predict() function

悲哀的现实 2020-12-25 13:15

I have trained an XGBRegressor model. When I use this trained model to predict on a new input, the predict() function throws a feature_names mismatch error.

8 answers
  •  再見小時候
    2020-12-25 13:43

    I'm contributing an answer because I ran into this problem when putting a fitted XGBRegressor model into production. This is therefore a solution for cases where you cannot select column names from a training or testing DataFrame, though there may be some crossover with other cases.

    The model had been fit on a Pandas DataFrame, and I was attempting to pass a single row of values as a np.array to the predict function. The values had already been preprocessed (reverse label encoded, etc.), and the array was entirely numeric.

    I got the familiar error:

    ValueError: feature_names mismatch followed by a list of the features, followed by a list of the same length: ['f0', 'f1' ....]
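
The error prints two lists: the names the model was trained with and the names it received. The `f0`, `f1`, ... entries appear to be placeholder names generated for input that carries no column names, which would match passing a bare NumPy array. A minimal sketch for comparing two such lists follows; `diff_feature_names` is a hypothetical helper (not part of xgboost) and the example names are placeholders.

```python
def diff_feature_names(expected, received):
    """Summarise how two feature-name lists differ."""
    expected, received = list(expected), list(received)
    # Names the model expects but the input does not provide.
    missing = [n for n in expected if n not in received]
    # Names the input provides but the model does not know.
    unexpected = [n for n in received if n not in expected]
    same_names = not missing and not unexpected
    return {
        "missing": missing,
        "unexpected": unexpected,
        # True when both lists hold the same names but are not identical,
        # which typically means the columns are in a different order.
        "order_differs": same_names and expected != received,
    }


# Placeholder names standing in for the two lists printed in the error.
trained = ["feature1", "feature2", "feature3"]
from_array = ["f0", "f1", "f2"]

report = diff_feature_names(trained, from_array)
print(report)
```

If every trained name is reported as missing and every received name is a placeholder, the input most likely lost its column names on the way to predict().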

    While there are no doubt more direct solutions, I had little time and this fixed the problem:

    1. Make the input vector a Pandas DataFrame:

    import pandas as pd

    # One single-element list per feature; `value` stands for the actual number.
    series = {'feature1': [value],
              'feature2': [value],
              'feature3': [value],
              'feature4': [value],
              'feature5': [value],
              'feature6': [value],
              'feature7': [value],
              'feature8': [value],
              'feature9': [value],
              'feature10': [value]
              }

    vector = pd.DataFrame(series)

    2. Get the feature names that the trained model knows:

    names = model.get_booster().feature_names

    3. Select those features from the input vector DataFrame (defined above), and perform iloc indexing:

    # vector[names] puts the columns in the order the model expects;
    # .iloc[[-1]] keeps the last row as a one-row DataFrame.
    result = model.predict(vector[names].iloc[[-1]])


    The iloc transformation I found here.

    To select the feature names, I used get_booster().feature_names, which I found in @Athar's post above, since models in the scikit-learn implementation do not have a feature_names attribute.
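
As a variation on the steps above, the NumPy row can be wrapped in a one-row DataFrame directly, using the names the model reports. This is only a sketch: `row_to_frame` is a hypothetical helper, the names and values are placeholders, and it assumes the array values are already in the same order as the training columns.

```python
import numpy as np
import pandas as pd


def row_to_frame(values, names):
    """Wrap a single row of values in a one-row DataFrame with named columns.

    Assumes `values` is ordered like `names` (the training column order).
    """
    row = np.asarray(values).reshape(1, -1)
    if row.shape[1] != len(names):
        raise ValueError(
            f"expected {len(names)} values, got {row.shape[1]}"
        )
    return pd.DataFrame(row, columns=list(names))


# Placeholder names; in the answer's setup these would come from
# model.get_booster().feature_names.
names = ["feature1", "feature2", "feature3"]
vector = row_to_frame(np.array([0.5, 1.0, 2.0]), names)

# With the fitted model from the answer, the call would then be:
# result = model.predict(vector)
```

The length check is there so that a row with too few or too many values fails early with a clear message rather than at prediction time.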

    Check out the documentation to learn more.

    Hope this helps.
