Weka J48 Classifier: Cannot handle numeric class?

Submitted by 人盡茶涼 on 2019-11-29 08:57:41
gerard

The J48 classifier is a decision tree classifier that only accepts nominal classes. This means the classes into which you will classify your instances must be known beforehand. For example, if you are trying to predict a rating and you know that the rating is on a 5-level Likert scale, you have to say so explicitly in your ARFF file with something like @attribute class {1,2,3,4,5}. If instead you want to predict the weight of a person, the value is probably a real number and therefore cannot 'fit' in a tree classification. NB: one way around this is to discretize the available weights into bins: from 10 to 15 kg, from 15 to 20 kg, etc. This way you get a nominal class attribute.
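
The binning workaround can be sketched in a few lines of Python. This is only an illustration, not part of Weka: the helper names weight_to_bin and arff_class_attribute, the 5 kg bin width, the label format, and the sample weights are all assumptions made for the example.

```python
def weight_to_bin(weight_kg, width=5):
    """Map a numeric weight to a nominal label such as '10_to_15'.

    Each bin covers [lower, lower + width); width=5 is an assumed value.
    """
    lower = int(weight_kg // width) * width
    return f"{lower}_to_{lower + width}"


def arff_class_attribute(labels):
    """Build an ARFF nominal class declaration from the labels in the data."""
    # Sort by the numeric lower bound so the bins are listed in order.
    ordered = sorted(set(labels), key=lambda s: int(s.split("_")[0]))
    return "@attribute class {" + ",".join(ordered) + "}"


weights = [12.3, 17.8, 14.9, 21.0]  # made-up sample values
labels = [weight_to_bin(w) for w in weights]
print(labels)
print(arff_class_attribute(labels))
```

The numeric class column in the data section would then be replaced by these labels, so that J48 sees a fixed set of classes instead of real numbers.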

Alasdair

The word vectors could be converted to binary like this:

java -Xmx4G -cp /weka/weka.jar weka.filters.unsupervised.attribute.NumericToBinary -i /home/test/cats-vector.arff -o /home/test/cats-binary.arff

However, this adds a bias to the data you are training on: binary strings that are close to one another are treated as more similar than strings that are far apart. If you want to remove this bias and treat each string as a completely unique entity, declare the class as nominal instead, e.g. @attribute class {ABC, DEF, GHI, etc}. Then it works!

If you really want to communicate that these features are important and not at all related, make a separate column for each string, holding the value 1 when a row has that category and 0 when it does not. This creates very sparse data, but the learning algorithm is then biased to scan that data for information gain.
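
The one-column-per-string idea is usually called one-hot encoding. A minimal sketch in plain Python follows, assuming the strings are already in a list; the function name one_hot and the sample values are hypothetical and only serve the illustration.

```python
def one_hot(values):
    """Return (categories, rows) with one 0/1 column per distinct string."""
    categories = sorted(set(values))
    # Each row has a 1 in the column of its own category and 0 elsewhere.
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


values = ["ABC", "DEF", "ABC", "GHI"]  # made-up sample strings
categories, rows = one_hot(values)
print(categories)
for value, row in zip(values, rows):
    print(value, row)
```

Each row contains exactly one 1, which is why the result becomes sparse as the number of distinct strings grows.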
