Process for oversampling data for imbalanced binary classification

余生分开走 2020-12-20 03:43

My data is split roughly 30%/70% between class 0 (the minority class) and class 1 (the majority class). Since I do not have a lot of data, I am planning to oversample the minority class to ba

2 answers
  •  半阙折子戏
    2020-12-20 03:59

    In my experience, this is bad practice. As you mentioned, the test data should contain only unseen samples; otherwise the evaluation rewards overfitting and overstates how well training went. If you need more samples, consider data transformations instead. For example, in human/cat image classification the subjects are roughly symmetric, so you can double the sample size by mirroring the images.
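To make the two points above concrete, here is a rough sketch using only NumPy. The question text is cut off, so this assumes the concern is oversampling before the train/test split. The function names `split_then_oversample` and `mirror_augment`, the 20% test fraction, and the `N x H x W` image layout are all illustrative assumptions, not something from the original post.

```python
import numpy as np


def split_then_oversample(X, y, test_fraction=0.2, seed=0):
    """Hold out a test set first, then oversample the minority class
    in the training part only (hypothetical helper for illustration)."""
    rng = np.random.default_rng(seed)

    # Shuffle indices and carve off the test set before any resampling,
    # so no duplicated row can end up on both sides of the split.
    idx = rng.permutation(len(y))
    n_test = int(round(test_fraction * len(y)))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    X_train, y_train = X[train_idx], y[train_idx]

    # Find the minority class and how many rows it is short by.
    classes, counts = np.unique(y_train, return_counts=True)
    minority = classes[np.argmin(counts)]
    deficit = counts.max() - counts.min()

    # Draw the extra minority rows with replacement from training data only.
    minority_rows = np.flatnonzero(y_train == minority)
    extra = rng.choice(minority_rows, size=deficit, replace=True)

    X_bal = np.concatenate([X_train, X_train[extra]])
    y_bal = np.concatenate([y_train, y_train[extra]])
    return X_bal, y_bal, X[test_idx], y[test_idx]


def mirror_augment(images, labels):
    """Double an image set by appending horizontally flipped copies.
    Assumes images has shape (N, H, W), so axis 2 is the width."""
    flipped = np.flip(images, axis=2)
    return np.concatenate([images, flipped]), np.concatenate([labels, labels])
```

The test set comes back untouched, so its class ratio still reflects the original data. If you mirror images, apply `mirror_augment` to the training part only, for the same reason.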
