Information Gain Calculation for a text file?

冷眼眸甩不掉的悲伤 submitted on 2019-12-02 06:57:31

I found my answer. For this, we have to generate an ARFF file.

In the .arff file:

The header section (the part that starts with @RELATION) declares one @ATTRIBUTE for every word present in your whole document collection after preprocessing. Each word is of type real because a tf-idf value is a real value.

The @data section contains the tf-idf values calculated during preprocessing. For example, the first row contains the tf-idf values of all words for the first document, with the document category in the last column.

@RELATION filename
@ATTRIBUTE word1 real
@ATTRIBUTE word2 real
@ATTRIBUTE word3 real
% ... and so on, one @ATTRIBUTE line per word
@ATTRIBUTE class {cacm,cisi,cran,med}

@data
0.5545479562,0.27,0.554544479562,0.4479562,cacm
0.5545479562,0.27,0.554544479562,0.4479562,cacm
0.55454479562,0.1619617,0.579562,0.5542,cisi
0.5545479562,0.27,0.554544479562,0.4479562,cisi
0.0,0.2396113617,0.44479562,0.2,cran
0.5545479562,0.27,0.554544479562,0.4479562,cran
0.5545177444479562,0.26196113617,0.0,0.0,med
0.5545479562,0.27,0.554544479562,0.4479562,med
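
A file with this layout can be produced by a short script. The following is a minimal sketch in Python, assuming the tf-idf values have already been computed during preprocessing. The function name `to_arff`, its parameters, and the sample values are hypothetical and only illustrate the layout described above.

```python
def to_arff(relation, words, rows, classes):
    """Build ARFF text with one real attribute per word and the class last.

    rows is a list of (tfidf_values, category) pairs, one pair per document.
    Assumes every word is a plain token (no spaces, commas, or quotes),
    since such names would need quoting in ARFF.
    """
    lines = [f"@RELATION {relation}"]
    lines += [f"@ATTRIBUTE {word} real" for word in words]
    lines.append("@ATTRIBUTE class {" + ",".join(classes) + "}")
    lines += ["", "@data"]
    for values, category in rows:
        if len(values) != len(words):
            raise ValueError("each row needs one tf-idf value per word")
        if category not in classes:
            raise ValueError(f"unknown category: {category}")
        lines.append(",".join(str(float(v)) for v in values) + "," + category)
    return "\n".join(lines) + "\n"


# Hypothetical sample: two words, two documents.
sample = to_arff(
    "filename",
    ["word1", "word2"],
    [([0.5545, 0.27], "cacm"), ([0.0, 0.2396], "cran")],
    ["cacm", "cisi", "cran", "med"],
)
print(sample)
```

Writing the returned string to a file with the .arff extension gives a file in the format shown above.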

After you generate this file, you can give it as input to InfoGainAttributeEval.java. This worked for me.
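
For intuition about the score being computed, the information gain of a word is the entropy of the class labels minus the entropy that remains after splitting the documents on that word. The sketch below is a simplified illustration in Python, not Weka's implementation: it assumes each word is reduced to present (tf-idf above a threshold) or absent, whereas Weka, as far as I know, discretizes numeric attributes before scoring, so the numbers need not match. The names `entropy` and `info_gain` and the sample data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def info_gain(values, labels, threshold=0.0):
    """Information gain of one word, treated as present/absent.

    values holds the word's tf-idf value in each document, labels the
    category of each document. A word counts as present when its tf-idf
    value is above threshold (an assumption made for this sketch).
    """
    if not labels or len(values) != len(labels):
        raise ValueError("values and labels must be non-empty and equal length")
    present = [lab for v, lab in zip(values, labels) if v > threshold]
    absent = [lab for v, lab in zip(values, labels) if v <= threshold]
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (present, absent) if part)
    return entropy(labels) - remainder


# Hypothetical data: the word occurs only in the "cacm" documents.
labels = ["cacm", "cacm", "med", "med"]
informative = info_gain([0.55, 0.27, 0.0, 0.0], labels)
uninformative = info_gain([0.55, 0.0, 0.27, 0.0], labels)
print(informative, uninformative)
```

A word that occurs only in documents of one category separates the classes completely and gets the full class entropy as its gain, while a word spread evenly across categories gets a gain of zero.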
