I am trying to create a surface plot on an external visualization platform. I'm working with the iris data set that is featured on the sklearn decision tree documentation p
For those interested, I recently had to implement this for higher-dimensional data as well; the code was as follows:
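import numpy as np

# Assumes `tree` is a fitted sklearn decision tree and `x` is the training data.
number_of_leaves = (tree.tree_.children_left == -1).sum()
features = x.shape[1]
# boundaries[k, f] = [lower, upper] bound of leaf k along feature f
boundaries = np.zeros([number_of_leaves, features, 2])
boundaries[:, :, 0] = -np.inf
boundaries[:, :, 1] = np.inf
locs = np.where(tree.tree_.children_left == -1)[0]
for k in range(locs.shape[0]):
    idx = locs[k]
    idx_new = idx
    # Walk from the leaf back up to the root (node 0).
    while idx_new != 0:
        i_check = np.where(tree.tree_.children_left == idx_new)[0]
        j_check = np.where(tree.tree_.children_right == idx_new)[0]
        if i_check.shape[0] == 1:
            # Reached via a left branch: x[feat_] <= threshold, so an upper bound.
            idx_new = i_check[0]
            feat_ = tree.tree_.feature[idx_new]
            val_ = tree.tree_.threshold[idx_new]
            # Keep the tightest bound if the feature is split more than once.
            boundaries[k, feat_, 1] = min(boundaries[k, feat_, 1], val_)
        elif j_check.shape[0] == 1:
            # Reached via a right branch: x[feat_] > threshold, so a lower bound.
            idx_new = j_check[0]
            feat_ = tree.tree_.feature[idx_new]
            val_ = tree.tree_.threshold[idx_new]
            boundaries[k, feat_, 0] = max(boundaries[k, feat_, 0], val_)
        else:
            # Should never occur; raise instead of looping forever.
            raise RuntimeError('Fail Case')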
Essentially I build up an n*d*2 tensor, where n is the number of leaves of the tree, d is the dimensionality of the space, and the third dimension holds the min and max values. Leaves are marked with -1 in tree.tree_.children_left / tree.tree_.children_right. For each leaf, I then walk backwards up to the root, find each branch that led to the leaf, and add its split threshold to the leaf's decision bounds.
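For anyone who wants to sanity-check the result, here is a minimal sketch of the same idea done top-down instead of bottom-up, so it is a variation and not the loop above. The names `leaf_bounds` and `clf` are my own, and I am assuming sklearn's convention that the left child holds samples with feature <= threshold. It fits a small tree on iris and checks that every training sample lies inside the box of the leaf it is assigned to.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def leaf_bounds(fitted_tree):
    """Hypothetical helper: return (leaf_ids, bounds).

    bounds[k, f] = (lower, upper) for leaf k along feature f.
    """
    t = fitted_tree.tree_
    leaf_ids, bounds = [], []
    # Each stack entry: (node id, lower bounds so far, upper bounds so far).
    stack = [(0, np.full(t.n_features, -np.inf), np.full(t.n_features, np.inf))]
    while stack:
        node, lo, hi = stack.pop()
        left, right = t.children_left[node], t.children_right[node]
        if left == -1:  # leaves have no children
            leaf_ids.append(node)
            bounds.append(np.stack([lo, hi], axis=1))
            continue
        f, thr = t.feature[node], t.threshold[node]
        # Left child: x[f] <= thr, which tightens the upper bound.
        hi_left = hi.copy()
        hi_left[f] = min(hi[f], thr)
        stack.append((left, lo, hi_left))
        # Right child: x[f] > thr, which tightens the lower bound.
        lo_right = lo.copy()
        lo_right[f] = max(lo[f], thr)
        stack.append((right, lo_right, hi))
    return np.array(leaf_ids), np.array(bounds)


X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
leaf_ids, bounds = leaf_bounds(clf)

# Map each sample's leaf id (from clf.apply) to its row in `bounds`.
row_of_leaf = {leaf: k for k, leaf in enumerate(leaf_ids)}
rows = np.array([row_of_leaf[leaf] for leaf in clf.apply(X)])

# sklearn compares float32 inputs against the thresholds, so mirror that here.
X32 = X.astype(np.float32)
inside = (X32 > bounds[rows, :, 0]) & (X32 <= bounds[rows, :, 1])
all_inside = bool(inside.all())
```

If `all_inside` comes out False, the lower/upper convention is probably flipped somewhere. The resulting boxes are half-open, (lower, upper], along each feature.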