Question
I am performing my analysis in R and will be implementing four algorithms:
1. Random forest (RF)
2. Logistic regression
3. Support vector machine (SVM)
4. Linear discriminant analysis (LDA)
I have 50 predictors and 1 target variable. All of the predictors and the target variable are binary, taking only the values 0 and 1.
I have the following questions:
Should I convert them all into factors?
After converting them into factors, the RF algorithm gives 100% accuracy, which surprised me.
Also, how should I treat my variables before feeding them into the other algorithms?
Thanks
Answer 1:
If your variables / predictors are categorical, then it is best to convert them to factors. Otherwise, they will likely be treated as numerical values.
If you are doing a classification task, it is best to have the target / response variable as a factor as well.
It is also worth checking the documentation of the functions you use to make sure they do not convert factors back to numerical values.
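As a rough illustration of the conversion described above, here is a minimal sketch in base R. The data frame `df`, its column names and the simulated values are all made up for the example, not taken from the question. It converts every column to a factor, checks the resulting classes, and then compares a logistic regression fitted on the numeric 0/1 columns with one fitted on the factor columns.

```r
# Hypothetical toy data: 5 binary predictors and a binary target, all stored as 0/1
set.seed(42)
n <- 200
df <- as.data.frame(matrix(rbinom(n * 5, 1, 0.5), nrow = n))
names(df) <- paste0("x", 1:5)
df$target <- rbinom(n, 1, plogis(-0.5 + 1.5 * df$x1 - 1.0 * df$x2))

# Convert every column (predictors and target) to a two-level factor
df_fac <- df
df_fac[] <- lapply(df_fac, factor, levels = c(0, 1))

# Verify the conversion before passing the data to any modelling function
col_classes <- sapply(df_fac, class)
print(col_classes)

# Logistic regression on the numeric 0/1 columns and on the factor columns
fit_num <- glm(target ~ ., data = df, family = binomial)
fit_fac <- glm(target ~ ., data = df_fac, family = binomial)

# Compare the estimated coefficients (names differ, e.g. "x1" vs "x11")
same_fit <- isTRUE(all.equal(unname(coef(fit_num)), unname(coef(fit_fac))))
print(same_fit)
```

For a two-level 0/1 variable, R's default treatment contrasts produce a single dummy column equal to the original 0/1 values, so for `glm` the two encodings should lead to the same fit. The target is where the type tends to matter more: some functions, such as `randomForest` from the randomForest package, choose between classification and regression based on whether the response is a factor.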
Answer 2:
Use AdaBoost.
Take a look at the different Kaggle kernels, especially the Mercedes ones, to get an idea of how to implement AdaBoost.
https://www.kaggle.com/c/mercedes-benz-greener-manufacturing/kernels
That dataset is a mix of numerical variables, factors, and 0/1 columns.
Source: https://stackoverflow.com/questions/46844180/all-binary-predictors-in-a-classification-task