问题
I have a decision tree classifier from sklearn and I use pydotplus to show it. However I don't really like when there is a lot of informations on each nodes for my presentation (entropy, samples and value).
To explain it easier to people I would like to only keep the decision and the class on it. Where can I modify the code to do it ?
Thank you.
回答1:
Accoring to the documentation, it is not possible to abstain from setting the additional information inside boxes. The only thing that you may implicitly omit is the impurity
parameter.
However, I have done it the other explicit way which is somewhat crooked. First, I save the .dot file setting the impurity to False. Then, I open it up and convert it to a string format. I use regex to subtract the redundant labels and resave it.
The code goes like this:
import pydotplus # pydot library: install it via pip install pydot
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import export_graphviz
from sklearn.datasets import load_iris
data = load_iris()
clf = DecisionTreeClassifier()
clf.fit(data.data, data.target)
export_graphviz(clf, out_file='tree.dot', impurity=False, class_names=True)
PATH = '/path/to/dotfile/tree.dot'
f = pydot.graph_from_dot_file(PATH).to_string()
f = re.sub('(\\\\nsamples = [0-9]+)(\\\\nvalue = \[[0-9]+, [0-9]+, [0-9]+\])', '', f)
f = re.sub('(samples = [0-9]+)(\\\\nvalue = \[[0-9]+, [0-9]+, [0-9]+\])\\\\n', '', f)
with open('tree_modified.dot', 'w') as file:
file.write(f)
Here are the images before and after modification:
In your case, there seems to be more parameters in boxes, so you may want to tweak the code a little bit.
I hope that helps!
来源:https://stackoverflow.com/questions/44821349/python-graphviz-remove-legend-on-nodes-of-decisiontreeclassifier