Difference of prediction results in random forest model

I have built a Random Forest model and I got two different prediction results when I wrote two different lines of code to generate the predictions. I wonder which o

1 Answer

独厮守ぢ

2021-01-07 11:45
The difference is in the two calls to predict:
```
predict(model)
```
and
```
predict(model, newdata=dat)
```
The first call returns the out-of-bag (OOB) predictions for your training data: each observation is predicted using only the trees whose bootstrap sample did not include it. This is generally what you want when comparing predicted values to actuals.

The second treats your training data as if it were a new dataset and runs every observation down every tree, including the trees that were trained on that observation. This produces an artificially close correlation between the predictions and the actuals, because the RF algorithm generally does not prune the individual trees and relies instead on the ensemble of trees to control overfitting. So don't do this if you want predictions on the training data.
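
To see the gap between the two calls, here is a minimal sketch, assuming the model was fit with the `randomForest` package. The data frame and the column names `x1`, `x2` and `y` are made up for illustration, since the original data is not shown.

```r
library(randomForest)

set.seed(42)

# Hypothetical training data (not from the question): y depends on x1 plus noise
dat <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
dat$y <- dat$x1 + rnorm(200)

model <- randomForest(y ~ x1 + x2, data = dat)

# Out-of-bag predictions: each row is predicted only by trees that did not see it
oob_pred <- predict(model)

# In-sample predictions: each row is run down every tree, including the ones trained on it
insample_pred <- predict(model, newdata = dat)

# Correlation with the actuals; the in-sample value should look much better than it deserves
cor_oob      <- cor(oob_pred, dat$y)
cor_insample <- cor(insample_pred, dat$y)
print(c(oob = cor_oob, insample = cor_insample))
```

The OOB correlation is the more honest estimate of how the forest would do on unseen data; the in-sample figure mostly reflects how closely the unpruned trees memorised the training rows.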