How to calculate the threshold value for numeric attributes in Quinlan's C4.5 algorithm?
I am trying to find how the C4.5 algorithm determines the threshold value for numeric attributes. I have researched and can not understand, in most places I've found this information: The training samples are first sorted on the values of the attribute Y being considered. There are only a finite number of these values, so let us denote them in sorted order as {v1,v2, …,vm}. Any threshold value lying between vi and vi+1 will have the same effect of dividing the cases into those whose value of the attribute Y lies in {v1, v2, …, vi} and those whose value is in {vi+1, vi+2, …, vm}. There are thus