How to calculate the threshold value for numeric attributes in Quinlan's C4.5 algorithm?

忘掉有多难 2021-01-03 02:31

I am trying to find out how the C4.5 algorithm determines the threshold value for numeric attributes. I have researched this and cannot understand it; in most places I've found this i

2 answers
  •  醉酒成梦
    2021-01-03 02:46

    I'm not entirely sure about J48, but assuming it's based on C4.5, it would evaluate every possible binary split of the feature's sorted values. For each candidate split it computes the information gain, and it chooses the split with the highest gain. In the case of {70,85,90,95}, it would compare {70|85,90,95}, {70,85|90,95}, and {70,85,90|95} and choose the best one.
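
The split search described above can be sketched in a few lines of Python. This is only an illustration of the idea, not J48's or C4.5's actual code: the function names `entropy` and `best_threshold` are made up for this example, the class labels in the usage note are hypothetical (the question only gives the attribute values), and reporting the lower of the two adjacent values as the threshold is a choice made here for simplicity.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the binary split with the highest information gain.

    Candidate cuts lie between adjacent distinct sorted values; a split sends
    rows with value <= threshold to the left and the rest to the right.
    """
    pairs = sorted(zip(values, labels))
    base = entropy([label for _, label in pairs])
    best = (None, -1.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut is possible between two equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        # Entropy after the split, weighted by the size of each side.
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - weighted
        if gain > best[1]:
            best = (pairs[i - 1][0], gain)
    return best
```

With the hypothetical labels `["yes", "yes", "no", "no"]` for the values `[70, 85, 90, 95]`, calling `best_threshold` compares the same three splits listed above and picks {70,85|90,95}, because that cut separates the two classes completely.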

    Quinlan's book on C4.5 is a good starting point (https://goo.gl/J2SsPf). See page 25 in particular.
