Oversampling or SMOTE in Pyspark

野趣味 2021-01-13 00:10

I have 7 classes and 115 records in total, and I want to run a Random Forest model on this data. But the data is not enough to get high accuracy. So i w

2 Answers
  •  感动是毒
    2021-01-13 00:50

    This project may be useful for your goal: Spark SMOTE

    But I think that 115 records aren't enough for a random forest. You could use a simpler technique, such as a decision tree.

    You can check this answer:

    Is Random Forest suitable for very small data sets?
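
    To make the SMOTE idea concrete, here is a minimal plain-Python sketch. It is not the API of the Spark SMOTE project mentioned above. It assumes the rows are small enough to hold in memory (115 records would be), that features are numeric tuples, and that every minority class has at least two records. The helper names `smote` and `oversample` are hypothetical.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def smote(samples, n_new, k=3, seed=0):
    """Return n_new synthetic points for ONE class.

    Each point lies on the segment between a randomly chosen sample
    and one of its k nearest neighbours from the same class.
    """
    if n_new <= 0:
        return []
    if len(samples) < 2:
        raise ValueError("need at least two samples of the class")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(samples))
        base = samples[i]
        # All other samples of this class, nearest first.
        others = samples[:i] + samples[i + 1:]
        neighbours = sorted(others, key=lambda s: dist(base, s))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def oversample(rows, k=3, seed=0):
    """rows: list of (features_tuple, label).

    Adds synthetic rows until every class is as large as the largest one.
    """
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(features)
    target = max(len(feats) for feats in by_label.values())
    balanced = list(rows)
    for label, feats in by_label.items():
        for point in smote(feats, target - len(feats), k, seed):
            balanced.append((point, label))
    return balanced
```

    In PySpark, one way to use this would be to pull the rows to the driver with `df.collect()`, convert them to `(features, label)` pairs, call `oversample`, and rebuild a DataFrame with `spark.createDataFrame`. Apply it to the training split only, so that synthetic points do not leak into the evaluation data.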
