Why can vector normalization improve the accuracy of clustering and classification?

误落风尘 2020-12-07 14:59

Mahout in Action says that normalization can slightly improve accuracy. Can anyone explain why? Thanks!

2 answers
  •  渐次进展
    2020-12-07 15:35

    The reason is that the variables are often measured in different units and on different scales, so their variances differ, and normalizing adjusts for that. For instance, in an age (x) vs. weight (y) comparison for a set of children, age might range from 1 to 10 while weight ranges from 10 to 100 pounds. If you don't normalize, both axes effectively have to share one 1-to-100 scale, so the plot shows the groups as strange, elongated ovals, and any distance computed between points is driven almost entirely by weight. Normalizing puts both axes on the same scale, so the plot (and the distances behind it) shows more meaningful clusters.
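
To make the age vs. weight example concrete, here is a minimal Python sketch. It assumes min-max scaling as the form of normalization (the answer does not name a specific method) and uses made-up data for five children; the helper names `min_max_scale`, `euclidean`, and `nearest` are hypothetical and do not come from Mahout.

```python
import math

def min_max_scale(values):
    """Rescale a list of numbers linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def euclidean(p, q):
    """Plain Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def nearest(points, i):
    """Index of the point closest to points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    return min(others, key=lambda j: euclidean(points[i], points[j]))

# Made-up data: (age in years, weight in pounds).
children = [(1, 10), (2, 28), (3, 40), (9, 50), (10, 100)]

# Scale each feature (column) independently onto [0, 1].
ages = min_max_scale([age for age, _ in children])
weights = min_max_scale([weight for _, weight in children])
scaled = list(zip(ages, weights))

# Compare the nearest neighbour of the 3-year-old (index 2)
# before and after scaling.
print("raw nearest:   ", children[nearest(children, 2)])
print("scaled nearest:", children[nearest(scaled, 2)])
```

With this data, the raw nearest neighbour of the 3-year-old is the 9-year-old, because a 10-pound weight difference counts for more than a 6-year age difference. After scaling, the nearest neighbour is the 2-year-old. A distance-based clusterer or classifier such as k-means or k-nearest neighbours sees the same shift.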
