random-forest | 易学教程

Using python generators in scikit-learn [closed]

Question (closed 5 years ago as needing details or clarity): I was wondering whether, and how, it is possible to use a Python generator as the data input to a scikit-learn classifier's .fit() function. Given the huge amount of data, this seems to make sense to me. In particular, I am about to implement a random forest approach. Regards, K. Answer 1: The answer is "no". To do
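The excerpt above breaks off before the answer's explanation, so what follows is not the answer's content. It is a rough sketch under stated assumptions: scikit-learn's fit() methods expect array-like input rather than a generator, so one possible workaround is to consume the generator in fixed-size batches and materialize each batch. The names row_stream and batches are hypothetical and exist only for this illustration.

```python
from itertools import islice


def row_stream(n_rows):
    """Hypothetical generator yielding (features, label) pairs one at a time."""
    for i in range(n_rows):
        yield [i, i % 3], i % 2


def batches(rows, batch_size):
    """Group a (features, label) iterator into (X, y) lists of at most batch_size rows."""
    rows = iter(rows)
    while True:
        chunk = list(islice(rows, batch_size))
        if not chunk:
            return
        X = [features for features, _ in chunk]
        y = [label for _, label in chunk]
        yield X, y


# Only one batch is held in memory at a time while iterating;
# list() is used here purely to inspect the result.
collected = list(batches(row_stream(10), batch_size=4))
batch_sizes = [len(X) for X, _ in collected]
```

Estimators that offer partial_fit (for example SGDClassifier) could be trained one batch at a time this way. RandomForestClassifier has no partial_fit, so for a random forest the batches would still have to be combined, or subsampled, into a single in-memory array before calling fit().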

R random forest inconsistent predictions

Question: I recently built a random forest model using the ranger package in R. However, I noticed that the predictions stored in the ranger object during training (accessible with model$predictions) do not match the predictions I get when I run the predict command on the same dataset with that model. The following code reproduces the problem on the mtcars dataset. I created a binary variable just to turn this into a classification problem, though I saw similar results with

Handling categorical features using scikit-learn

Question: What am I doing? I am solving a classification problem using Random Forests. I have a set of fixed-length strings (10 characters long) that represent DNA sequences. The DNA alphabet consists of 4 letters, namely A, C, G, T. Here's a sample of my raw data: ATGCTACTGA ACGTACTGAT AGCTATTGTA CGTGACTAGT TGACTATGAT Each DNA sequence comes with experimental data describing a real biological response: the molecule was seen to elicit a biological response (1), or not (0). Problem: The training set
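The excerpt is cut off before the problem statement, so the following is only a minimal sketch of one common way to turn such sequences into numeric features: one-hot encoding each position. The alphabet and the sample sequences come from the question; the helper name one_hot_encode is hypothetical, and nothing here is claimed to be the approach the question or its answers settled on.

```python
# Alphabet taken from the question; the helper name is an assumption.
ALPHABET = "ACGT"


def one_hot_encode(sequence, alphabet=ALPHABET):
    """Return a flat 0/1 feature vector with len(alphabet) entries per position."""
    index = {letter: i for i, letter in enumerate(alphabet)}
    features = []
    for letter in sequence:
        slot = [0] * len(alphabet)
        slot[index[letter]] = 1  # raises KeyError for letters outside the alphabet
        features.extend(slot)
    return features


# Sample sequences quoted in the question.
sequences = ["ATGCTACTGA", "ACGTACTGAT", "AGCTATTGTA", "CGTGACTAGT", "TGACTATGAT"]
X = [one_hot_encode(s) for s in sequences]
```

Each row of X has 40 entries (10 positions times 4 letters) and could be passed, together with the 0/1 responses, to a classifier such as scikit-learn's RandomForestClassifier.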

something similar to permutation accuracy importance in h2o package

Question: I fitted a random forest for my multinomial target with the randomForest package in R. While looking into variable importance, I found permutation accuracy importance, which is what I need for my analysis. I also fitted a random forest with the h2o package, but the only measures it shows me are relative_importance, scaled_importance, and percentage. My question is: can I extract a measure that shows which level of the target is best classified by the variable I want to examine?

Error in predicting raster with randomForest, Caret, and factor variables

Question: I am trying to predict a raster layer with randomForest and the caret package, but it fails when I introduce factor variables. Without factors, everything works fine, but as soon as I bring a factor in, I get the error: Error in predict.randomForest(modelFit, newdata) : Type of predictors in new data do not match that of the training data. I have created some sample code below that walks through the process. I present it in a few steps for transparency and to provide a working example. (To skip

How do I replace the bootstrap step in the package randomForest r

Question: First some background info, which is probably better suited to stats.stackexchange: in my data analysis I am trying to compare the performance of different machine learning methods on time series data (regression, not classification). For example, I have trained a boosting model and compare it with a Random Forest model (R package randomForest). I use time series data where the explanatory variables are lagged values of other data and of the dependent variable. For some reason

How to install BigMemory and bigrf on windows OS

Question: I have been trying to install bigmemory on my R installation. My OS is Windows 7 64-bit, and I have tried it on R 2.15.1, 2.15.2, and 3.0.1 (64-bit), but I cannot get it to work. I have tried several options: (1) downloading the current source and running, in R v3.0.1, install.packages("D:/Downloads/bigmemory_4.4.3.tar.gz", repos = NULL, type="source"), which gives the error "ERROR: Unix-only package"; (2) downloading older sources and running similar commands in the various installations of R (V2, V3, etc.). This

caret - random-forests not working: “Something is wrong; all the Accuracy metric values are missing:”

Question: Related to these: getting this error in Caret, https://github.com/topepo/caret/issues/160. I'm getting this error:

    Something is wrong; all the Accuracy metric values are missing:
        Accuracy       Kappa
     Min.   : NA   Min.   : NA
     1st Qu.: NA   1st Qu.: NA
     Median : NA   Median : NA
     Mean   :NaN   Mean   :NaN
     3rd Qu.: NA   3rd Qu.: NA
     Max.   : NA   Max.   : NA
     NA's   :5     NA's   :5
    Error in train.default(x, y, weights = w, ...) : Stopping
    In addition: Warning message:
    In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo

R random forest : data (x) has 0 rows

Question: I am using the randomForest function from the randomForest package to find the most important variable. My data frame is called urban, and my response variable, revenue, is numeric. urban.random.forest <- randomForest(revenue ~ .,y=urban$revenue, data = urban, ntree=500, keep.forest=FALSE,importance=TRUE,na.action = na.omit) I get the following error: Error in randomForest.default(m, y, ...) : data (x) has 0 rows. In the source code, the error is related to the x variable: n <- nrow(x) p <- ncol(x) if (n ==

Use of randomforest() for classification in R?

Question: I originally had a data frame of 12 columns and N rows. The last column is my class (0 or 1). I had to convert my entire data frame to numeric with training <- sapply(training.temp,as.numeric) But then I thought the class column needed to be a factor for the randomforest() tool to work as a classifier, so I did training[,"Class"] <- factor(training[,ncol(training)]) I then proceeded to create the tree with training_rf <- randomForest(Class ~., data = trainData, importance = TRUE, do