R: Random forest predict fails with NAs in predictors

Posted by 狂风中的少年 on 2019-12-11 07:16:03

Question


The documentation (if I'm reading it correctly) says that the random forest predict function returns NA predictions for observations that have NAs in their predictors.

NOTE: If the object inherits from randomForest.formula, then any data with NA are silently omitted from the prediction. The returned value will contain NA correspondingly in the aggregated and individual tree predictions (if requested), but not in the proximity or node matrices

However, when I call predict on a dataset with NAs in some predictors (7 observations out of 2688), prediction fails with the following error:

Error in predict.randomForest(model, new.ds) : missing values in newdata

There is a slightly messy workaround that I would like to avoid if possible.

Am I doing or reading something wrong? Does it have something to do with the "inherits from randomForest.formula" clause?


Answer 1:


Using some examples from the documentation:

> library(randomForest)
> set.seed(1)
> x <- data.frame(x1=gl(32, 5), x2=runif(160), y=rnorm(160))
> # default (x, y) interface: no formula involved
> rf1 <- randomForest(x[-3], x[[3]], ntree=10)
> inherits(rf1, "randomForest.formula")
[1] FALSE

> # formula interface
> iris.rf <- randomForest(Species ~ ., data=iris, importance=TRUE,
+                         proximity=TRUE)
> inherits(iris.rf, "randomForest.formula")
[1] TRUE

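So you probably fit your model with the default (x, y) interface rather than the formula interface. The NOTE you quoted applies only to objects that inherit from randomForest.formula; for any other randomForest object, predict stops with the "missing values in newdata" error.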
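To make the difference concrete, here is a minimal sketch on the iris data. It assumes the randomForest package is installed. The names xy.rf and predict_with_na are hypothetical and chosen only for illustration, new.ds is a small stand-in for the dataset in the question, and ntree=50 is an arbitrary choice. The helper at the end is one possible workaround for a model that was fit without a formula, not necessarily the one the question had in mind.

```r
library(randomForest)

set.seed(1)

# Five rows of new data with one missing predictor value in row 2
new.ds <- iris[1:5, ]
new.ds$Sepal.Length[2] <- NA

# Formula interface: rows with NA are dropped internally and
# come back as NA in the returned predictions
iris.rf <- randomForest(Species ~ ., data = iris, ntree = 50)
pred <- predict(iris.rf, new.ds)

# Default (x, y) interface: predict() stops on any NA in newdata
xy.rf <- randomForest(iris[-5], iris[[5]], ntree = 50)
err_msg <- tryCatch(
  predict(xy.rf, new.ds[-5]),
  error = function(e) conditionMessage(e)
)

# Hypothetical helper: predict only on complete rows and
# leave NA in the positions of the incomplete ones
predict_with_na <- function(model, newdata) {
  ok <- complete.cases(newdata)
  p <- predict(model, newdata[ok, , drop = FALSE])
  idx <- rep(NA_integer_, nrow(newdata))
  idx[ok] <- seq_len(sum(ok))
  unname(p[idx])  # indexing with NA yields NA and keeps factor levels
}

safe_pred <- predict_with_na(xy.rf, new.ds[-5])
```

If refitting is an option, using the formula interface is probably the simpler route, since predict then handles the incomplete rows on its own.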



Source: https://stackoverflow.com/questions/21559600/r-random-forest-predict-fails-with-nas-in-predictors
