Feature Selection and Reduction for Text Classification

北恋 2020-12-07 07:29

I am currently working on a project: a simple sentiment analyzer that will handle 2 classes in one case and 3 classes in another.

5 answers
  •  夕颜 (original poster)
    2020-12-07 08:23

    I would recommend dimensionality reduction instead of feature selection. Consider singular value decomposition, principal component analysis, or, better still because it is tailored to bag-of-words representations, Latent Dirichlet Allocation. These methods let you notionally retain a representation that includes all words while collapsing it to fewer dimensions by exploiting similarity (or even synonymy-type) relations between words.

    All of these methods have fairly standard implementations that you can download and run. If you let us know which language you're using, I or someone else will be able to point you in the right direction.
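
For illustration, here is a minimal sketch of the SVD route (often called latent semantic analysis when applied to a bag-of-words matrix). It is written in Python with NumPy only because the question does not name a language. The toy corpus, the `project` helper, and the choice of `k = 2` are assumptions made up for this example, not details from the question.

```python
import numpy as np

# Toy corpus (hypothetical data, for illustration only).
docs = [
    "great movie loved it",
    "loved the great acting",
    "terrible movie hated it",
    "hated the terrible acting",
]

# Bag-of-words count matrix: one row per document, one column per word.
vocab = sorted({w for d in docs for w in d.split()})
index = {w: j for j, w in enumerate(vocab)}
X = np.zeros((len(docs), len(vocab)))
for i, d in enumerate(docs):
    for w in d.split():
        X[i, index[w]] += 1

# Truncated SVD: keep only the k largest singular values and vectors.
k = 2  # assumed target dimensionality; tune it on held-out data
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_reduced = U[:, :k] * s[:k]  # shape (n_docs, k)


def project(text):
    """Map a new document into the same k-dimensional space."""
    v = np.zeros(len(vocab))
    for w in text.split():
        if w in index:  # words outside the vocabulary are ignored
            v[index[w]] += 1
    return v @ Vt[:k].T


print(X.shape, "->", X_reduced.shape)
```

The rows of `X_reduced` are what you would feed to the classifier in place of the raw word counts, and `project` applies the same mapping to unseen text. On a real corpus a library routine that works on sparse matrices, such as scikit-learn's `TruncatedSVD`, would likely be the more practical choice.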
