random-forest | 易学教程

Got continuous is not supported error in RandomForestRegressor

Question: I'm just trying to run a simple RandomForestRegressor example, but while testing the accuracy I get this error:

    /Users/noppanit/anaconda/lib/python2.7/site-packages/sklearn/metrics/classification.pyc in accuracy_score(y_true, y_pred, normalize, sample_weight)
        177
        178     # Compute accuracy for each possible representation
    --> 179     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
        180     if y_type.startswith('multilabel'):
        181         differing_labels = count_nonzero(y_true - y_pred, axis=1)

    /Users
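The traceback above ends inside `accuracy_score`, which is a classification metric; calling it on real-valued targets is the usual trigger for a "continuous is not supported" message. A minimal sketch of regression-style scoring follows, written in plain Python so it runs without scikit-learn. The toy values and the helper names `mse` and `r2` are illustrative assumptions, not part of the question; scikit-learn itself provides `mean_squared_error` and `r2_score` in `sklearn.metrics`.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical continuous targets and regressor predictions.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

error = mse(y_true, y_pred)
score = r2(y_true, y_pred)
print(error, score)
```

Both quantities are defined for continuous targets, whereas accuracy only makes sense when predictions are discrete labels.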

Python NLP - ValueError: could not convert string to float: 'UKN'

Question: I'm trying to train a random forest regressor to predict the hourly wage of an employee given the job description supplied. Note: I've signed an NDA and cannot upload real data. The "observation" below is synthetic:

    sample_row = {'job_posting_id': 'id_01',
                  'buyer_vertical': 'Business Services',
                  'currency': 'USD',
                  'fg_onet_code': '43-9011.00',
                  'jp_title': 'Computer Operator',
                  'jp_description': "Performs information security-related risk and compliance activities, including but not limited to
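A "could not convert string to float" error generally means that a raw string such as 'UKN' reached a step that expects numbers. The excerpt is cut off before it shows which column is involved, so the following is only a sketch of one common remedy, one-hot encoding of categorical fields, in plain Python. The `hours` field and the second row are hypothetical; only `buyer_vertical` and `currency` come from the sample row above.

```python
def one_hot_encode(rows, categorical_keys):
    """Replace each categorical string field with 0/1 indicator columns."""
    # Collect the sorted set of values seen for each categorical field.
    vocab = {key: sorted({row[key] for row in rows}) for key in categorical_keys}
    encoded = []
    for row in rows:
        # Copy the fields that are already numeric.
        out = {k: v for k, v in row.items() if k not in categorical_keys}
        for key in categorical_keys:
            for value in vocab[key]:
                out[f"{key}={value}"] = 1.0 if row[key] == value else 0.0
        encoded.append(out)
    return encoded


# Hypothetical rows: 'hours' is an invented numeric field for illustration.
rows = [
    {"buyer_vertical": "Business Services", "currency": "USD", "hours": 40.0},
    {"buyer_vertical": "UKN", "currency": "USD", "hours": 35.0},
]
encoded = one_hot_encode(rows, ["buyer_vertical", "currency"])
print(encoded[1])
```

After encoding, every value is a float, so a placeholder such as 'UKN' becomes its own indicator column instead of a string the model cannot parse.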

How to calculate class weights for Random forests

Question: I have datasets for 2 classes on which I have to perform binary classification. I chose random forest as the classifier because it gives me the best accuracy among the models I tried. Dataset-1 contains 462 datapoints and dataset-2 contains 735 datapoints. I noticed that my data has a minor class imbalance, so I tried to optimise my training and retrained the model with class weights. I provided the following class weights:

    cwt <- c(0.385,0.614) # Class weights
    ss <- c(300
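The quoted weights match the class proportions, 462/1197 and 735/1197. As a point of comparison, here is a small Python sketch that computes those proportions alongside an inverse-frequency ("balanced") heuristic, `total / (n_classes * count)`, which gives the rarer class the larger weight. The variable names are illustrative, and whether either scheme suits the R package used in the question is not something the excerpt settles.

```python
# Class sizes taken from the question.
counts = {"dataset-1": 462, "dataset-2": 735}
total = sum(counts.values())
n_classes = len(counts)

# Share of all samples that falls in each class.
proportions = {name: n / total for name, n in counts.items()}

# Inverse-frequency heuristic: rarer classes receive larger weights.
balanced = {name: total / (n_classes * n) for name, n in counts.items()}

for name in counts:
    print(name, round(proportions[name], 3), round(balanced[name], 3))
```

With the balanced heuristic, each weight multiplied by its class count gives the same total, so both classes contribute equally in aggregate.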

Variable importance with ranger

Question: I trained a random forest using caret + ranger.

    fit <- train(
        y ~ x1 + x2
        ,data = total_set
        ,method = "ranger"
        ,trControl = trainControl(method="cv", number = 5, allowParallel = TRUE, verbose = TRUE)
        ,tuneGrid = expand.grid(mtry = c(4,5,6))
        ,importance = 'impurity'
    )

Now I'd like to see the importance of the variables. However, none of these work:

    > importance(fit)
    Error in UseMethod("importance") : no applicable method for 'importance' applied to an object of class "c('train', 'train.formula')

Scala: how to know which probability correspond to which class?

Question: I created a random forest classifier to predict something. The label is either "yes" (=1.0) or "no" (=0.0). I apply my model to a test set. Here is my code and my result for 20 lines:

    import org.apache.spark.ml.tuning.CrossValidatorModel
    import org.apache.spark.sql.types._
    import org.apache.spark.sql._
    import org.apache.spark.sql.functions.udf
    import org.apache.spark.sql.functions._

    var modelrf = CrossValidatorModel.load("modelSupervise/newModel")
    var test = spark.sql("""select * from dc.newTest"""

Forecasting future occurrences with Random Forest

Question: I'm currently exploring the use of Random Forests to predict future values of occurrences (my ARIMA model gave me really bad forecasts, so I'm trying to evaluate other options). I'm fully aware that the bad results might be because I don't have a lot of data and its quality isn't the greatest. My initial data consisted simply of the number of occurrences per date. I then added separate columns representing the day, month, year, day of the week (which was later one-hot encoded)
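The feature construction described above (day, month, year, plus a one-hot day of the week) can be sketched with the Python standard library. The function name `date_features`, the column names, and the example date are illustrative assumptions; the question does not show its own code for this step.

```python
from datetime import date

# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def date_features(d):
    """Turn a date into calendar columns plus a one-hot day of the week."""
    features = {"day": d.day, "month": d.month, "year": d.year}
    for index, name in enumerate(WEEKDAYS):
        features[f"dow_{name}"] = 1 if d.weekday() == index else 0
    return features


# Hypothetical example date.
example = date_features(date(2020, 1, 6))
print(example)
```

Each row then carries exactly one active day-of-week indicator, which keeps a tree-based model from treating weekdays as an ordered quantity.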

R randomForest - how to predict with a “getTree” tree

Question: Background: I can make a random forest in R:

    set.seed(1)
    library(randomForest)
    data(iris)
    model.rf <- randomForest(Species ~ ., data=iris, importance=TRUE, ntree=20, mtry = 2)

I can predict values using the randomForest object that I just made:

    my_pred <- predict(model.rf)
    plot(iris$Species,my_pred)

I can then peel off some random tree from the forest:

    idx <- sample(x = 1:20,size = 1,replace = F)
    single_tree <- getTree(model.rf,k=1)

Questions: How do I predict from a single tree pulled from