Proximity Matrix in sklearn.ensemble.RandomForestClassifier

盖世英雄少女心 2020-12-28 20:01

I'm trying to perform clustering in Python using Random Forests. In the R implementation of Random Forests, there is a flag you can set to get the proximity matrix. I can'

3 Answers
  •  温柔的废话
    2020-12-28 20:52

    We don't implement a proximity matrix in scikit-learn (yet).

    However, you can compute one yourself with the apply function provided in our implementation of decision trees. For every pair of samples in your dataset, iterate over the decision trees in the forest (through forest.estimators_) and count the number of times the two samples fall in the same leaf, i.e., the number of times apply gives the same node id for both samples in the pair.

    Hope this helps.
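
The counting procedure described in the answer above could be sketched roughly as follows. This is only an illustration, not part of scikit-learn: the helper name proximity_matrix, its normalize option (dividing the counts by the number of trees), and the iris example data are all assumptions made for the sketch. It builds a dense n_samples x n_samples array, so it is only practical for datasets of moderate size.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier


def proximity_matrix(forest, X, normalize=True):
    """Hypothetical helper: count shared leaves for every pair of samples.

    Entry (i, j) is the number of trees in which samples i and j end up
    in the same leaf, optionally divided by the number of trees.
    """
    n_samples = X.shape[0]
    prox = np.zeros((n_samples, n_samples))
    for tree in forest.estimators_:
        # apply returns the id of the leaf each sample falls into
        leaves = tree.apply(X)
        # Pairwise comparison: True where two samples share a leaf
        prox += leaves[:, None] == leaves[None, :]
    if normalize:
        # Assumption: scale counts to [0, 1] by the number of trees
        prox /= len(forest.estimators_)
    return prox


# Example data and forest settings are illustrative only
X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
prox = proximity_matrix(forest, X)
print(prox.shape)
```

For clustering, one option is to treat 1 - prox as a dissimilarity and pass it to a clustering routine that accepts a precomputed distance matrix.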
