SVM for Text Mining using scikit

狂风中的少年 提交于 2019-12-07 23:31:22

问题


Can someone share a code snippet that shows how to use SVM for text mining using scikit. I have seen an example of SVM on numerical data but not quite sure how to deal with text. I looked at http://scikit-learn.org/stable/auto_examples/document_classification_20newsgroups.html but couldn't find SVM.


回答1:


In text mining problems, text is represented by numeric values. Each feature represent a word and values are binary numbers. That gives a matrix with lots of zeros and a few 1s which means that the corresponding words exist in the text. Words can be given some weights according to their frequency or some other criteria. Then you get some real numbers instead of 0 and 1.

After converting the dataset to numerical values you can use this example: http://scikit-learn.org/dev/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC



来源:https://stackoverflow.com/questions/15818951/svm-for-text-mining-using-scikit

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!