import sys
import os
Visualizing tree-based classifiers
The dtreeviz library is designed to help machine learning practitioners visualize and interpret decision trees and decision-tree-based models, such as gradient boosting machines.
This notebook illustrates the main capabilities and functions of the dtreeviz API, using scikit-learn and the toy but well-known Titanic data set. Currently, dtreeviz supports the following decision tree libraries: scikit-learn, XGBoost, Spark MLlib, LightGBM, and TensorFlow Decision Forests.
To interoperate with these different libraries, dtreeviz uses an adaptor object, obtained from the function dtreeviz.model(), to extract the model information necessary for visualization. Given such an adaptor object, all of the dtreeviz functionality is available through the same programming interface. The basic dtreeviz usage recipe is:
- Import dtreeviz and your decision tree library
- Acquire and load data into memory
- Train a classifier or regressor model using your decision tree library
- Obtain a dtreeviz adaptor model using viz_model = dtreeviz.model(your_trained_model,...)
- Call dtreeviz functions, such as viz_model.view() or viz_model.explain_prediction_path(sample_x)
The four categories of dtreeviz functionality are:
- Tree visualizations
- Prediction path explanations
- Leaf information
- Feature space exploration
We have grouped the code examples by classifiers and regressors, with a follow-up section on partitioning feature space.
These examples require dtreeviz 2.0 or above because the code uses the new API introduced in 2.0.
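One way to confirm that the installed version meets this requirement is to compare it against 2.0 before running the rest of the notebook. The sketch below uses only the standard library. The helper name `meets_minimum` is ours, not part of dtreeviz, and it assumes the version string begins with dotted integers.

```python
from importlib.metadata import PackageNotFoundError, version


def meets_minimum(version_string, minimum=(2, 0)):
    """Return True if a dotted version string is at least `minimum`.

    Hypothetical helper: only the leading digits of each component are
    compared, so suffixes such as 'rc1' are ignored.
    """
    parts = []
    for piece in version_string.split(".")[:len(minimum)]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    # Pad short versions such as "2" to the same length as `minimum`
    parts += [0] * (len(minimum) - len(parts))
    return tuple(parts) >= tuple(minimum)


try:
    installed = version("dtreeviz")
    status = "OK" if meets_minimum(installed) else "is older than 2.0"
    print(f"dtreeviz {installed} {status}")
except PackageNotFoundError:
    print("dtreeviz is not installed")
```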
Setup
%config InlineBackend.figure_format = 'retina' # Make visualizations look good
#%config InlineBackend.figure_format = 'svg'
%matplotlib inline
if 'google.colab' in sys.modules:
    !pip install -q dtreeviz
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor
import dtreeviz
random_state = 1234 # get reproducible trees
Load Sample Data
dataset_url = "https://raw.githubusercontent.com/parrt/dtreeviz/master/data/titanic/titanic.csv"
dataset = pd.read_csv(dataset_url)
# Fill missing values for Age
dataset.fillna({"Age":dataset.Age.mean()}, inplace=True)
# Encode categorical variables
dataset["Sex_label"] = dataset.Sex.astype("category").cat.codes
dataset["Cabin_label"] = dataset.Cabin.astype("category").cat.codes
dataset["Embarked_label"] = dataset.Embarked.astype("category").cat.codes
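The `.astype("category").cat.codes` idiom used above assigns an integer code to each distinct value, in sorted order, and uses -1 for missing values. The short sketch below shows this on made-up cabin values, which are not taken from the Titanic data.

```python
import pandas as pd

# Made-up values for illustration only
cabin = pd.Series(["C85", None, "B42", "C85"])

# Categories are sorted ("B42" -> 0, "C85" -> 1); missing values get -1
codes = cabin.astype("category").cat.codes
print(codes.tolist())  # [1, -1, 0, 1]
```

This is why Cabin_label can take the value -1 in the node statistics shown later: the cabin entry is missing for those passengers.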
To demonstrate classifier decision trees, we train a model that uses six features to predict the boolean Survived target.
features = ["Pclass", "Age", "Fare", "Sex_label", "Cabin_label", "Embarked_label"]
target = "Survived"

tree_classifier = DecisionTreeClassifier(max_depth=3, random_state=random_state)
tree_classifier.fit(dataset[features].values, dataset[target].values)
DecisionTreeClassifier(max_depth=3, random_state=1234)
Initialize dtreeviz model (adaptor)
To adapt dtreeviz to a specific model, use the model() function to get an adaptor. You’ll need to provide the model, X/y data, feature names, target name, and target class names:
viz_model = dtreeviz.model(tree_classifier,
                           X_train=dataset[features], y_train=dataset[target],
                           feature_names=features,
                           target_name=target, class_names=["perish", "survive"])
We’ll use this model to demonstrate dtreeviz functionality in the following sections; the code will look the same for any decision tree library once we have this model adaptor.
Tree structure visualizations
To show the decision tree structure using the default visualization, call view():
viz_model.view(scale=0.8)
To change the visualization, you can pass parameters, for example to switch the orientation to left-to-right:
viz_model.view(orientation="LR")
To visualize larger trees, you can reduce the amount of detail by turning off the fancy view:
viz_model.view(fancy=False)
Another way to reduce the visualization size is to specify the tree depths of interest:
viz_model.view(depth_range_to_display=(1, 2)) # root is level 0
Prediction path explanations
For interpretation purposes, we often want to understand how a tree behaves for a specific instance. Let’s pick a specific instance:
x = dataset[features].iloc[10]
x
Pclass 3.0
Age 4.0
Fare 16.7
Sex_label 0.0
Cabin_label 145.0
Embarked_label 2.0
Name: 10, dtype: float64
and then display the path through the tree structure:
viz_model.view(x=x)
viz_model.view(x=x, show_just_path=True)
You can also get a string representation explaining the comparisons made as an instance is run down the tree:
print(viz_model.explain_prediction_path(x))
2.5 <= Pclass
Fare < 23.35
Sex_label < 0.5
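The comparisons along a prediction path can also be recovered directly from a scikit-learn tree. The sketch below is not how dtreeviz implements explain_prediction_path(); it is a minimal illustration that uses scikit-learn's `decision_path()` and the arrays on `tree_`, trained on Iris so that it runs on its own. The helper name `path_conditions` is hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=1234)
clf.fit(iris.data, iris.target)


def path_conditions(clf, x, feature_names):
    """List the (feature, operator, threshold) tests an instance passes through."""
    tree = clf.tree_
    # decision_path returns a sparse indicator matrix; for a single row,
    # its column indices are the ids of the nodes the instance visits
    node_ids = clf.decision_path(x.reshape(1, -1)).indices
    conditions = []
    for node in node_ids:
        if tree.children_left[node] == tree.children_right[node]:
            continue  # leaves have no test (both children are -1)
        f = tree.feature[node]
        t = tree.threshold[node]
        # scikit-learn sends an instance left when value <= threshold
        op = "<=" if x[f] <= t else ">"
        conditions.append((feature_names[f], op, float(t)))
    return conditions


x_iris = iris.data[10]
for name, op, threshold in path_conditions(clf, x_iris, iris.feature_names):
    print(f"{name} {op} {threshold:.2f}")
```

Note that the notation differs from the dtreeviz output above: this sketch prints each test from the feature's point of view, using scikit-learn's convention that values less than or equal to the threshold go left.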
If you’d like the feature importance for a specific instance, as calculated by the underlying decision tree library, use instance_feature_importance():
viz_model.instance_feature_importance(x, figsize=(3.5,2))
Leaf info
There are a number of functions to get information about the leaves of the tree.
viz_model.leaf_sizes(figsize=(3.5,2))
viz_model.ctree_leaf_distributions(figsize=(3.5,2))
viz_model.node_stats(node_id=6)
 | Pclass | Age | Fare | Sex_label | Cabin_label | Embarked_label |
---|---|---|---|---|---|---|
count | 117.0 | 117.0 | 117.0 | 117.0 | 117.0 | 117.0 |
mean | 3.0 | 23.976667 | 11.722829 | 0.0 | 6.196581 | 1.34188 |
std | 0.0 | 10.534377 | 4.695136 | 0.0 | 31.167855 | 0.789614 |
min | 3.0 | 0.75 | 6.75 | 0.0 | -1.0 | 0.0 |
25% | 3.0 | 18.0 | 7.775 | 0.0 | -1.0 | 1.0 |
50% | 3.0 | 27.0 | 9.5875 | 0.0 | -1.0 | 2.0 |
75% | 3.0 | 29.699118 | 15.5 | 0.0 | -1.0 | 2.0 |
max | 3.0 | 63.0 | 23.25 | 0.0 | 145.0 | 2.0 |
viz_model.leaf_purity(figsize=(3.5,2))
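As a cross-check on the leaf plots, the number of training instances in each leaf, and their class breakdown, can be computed with scikit-learn alone. The sketch below is an independent illustration rather than the dtreeviz implementation. It uses the classifier's `apply()` method on the Iris data so that it is self-contained.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=666)
clf.fit(iris.data, iris.target)

# apply() returns, for every row, the id of the leaf that the row lands in
leaf_ids = clf.apply(iris.data).tolist()

leaf_sizes = Counter(leaf_ids)
class_counts = Counter(zip(leaf_ids, iris.target.tolist()))

for leaf in sorted(leaf_sizes):
    per_class = [class_counts[(leaf, k)] for k in range(len(iris.target_names))]
    print(f"leaf {leaf}: {leaf_sizes[leaf]} instances, class counts {per_class}")
```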
Feature Space Partitioning
Decision trees partition feature space in such a way as to maximize target value purity for the instances associated with a node. It’s often useful to visualize the feature space partitioning, although it’s not feasible to visualize more than a couple of dimensions.
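Purity can be made concrete with an impurity measure. scikit-learn classifiers use Gini impurity by default, and the sketch below computes it for a made-up node before and after a single split. The function names and the data are ours, chosen only for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(left, right):
    """Impurity after a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Made-up node: one feature value and one class label per instance
values = [1.0, 1.5, 2.0, 3.0, 3.5, 4.0]
labels = [0, 0, 0, 1, 1, 0]

threshold = 2.5
left = [y for v, y in zip(values, labels) if v <= threshold]
right = [y for v, y in zip(values, labels) if v > threshold]

parent = gini(labels)
print(f"before split: {parent:.3f}")                     # 0.444
print(f"after split:  {weighted_gini(left, right):.3f}")  # 0.222
```

A tree learner searches over features and thresholds for the split with the largest such drop in impurity, which is what produces the region boundaries drawn in the feature space plots.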
Classification
To visualize how a decision tree partitions a single feature, let’s train a shallow decision tree classifier using the toy Iris data.
from sklearn.datasets import load_iris
iris = load_iris()
features = list(iris.feature_names)
class_names = iris.target_names
X = iris.data
y = iris.target

dtc_iris = DecisionTreeClassifier(max_depth=2, min_samples_leaf=1, random_state=666)
dtc_iris.fit(X, y)
DecisionTreeClassifier(max_depth=2, random_state=666)
viz_model = dtreeviz.model(dtc_iris,
                           X_train=X, y_train=y,
                           feature_names=features,
                           target_name='iris',
                           class_names=class_names)
The following diagram indicates that the decision tree splits the petal width feature into three mostly-pure regions (using random_state above to get the same tree each time):
viz_model.ctree_feature_space(show={'splits','title'}, features=['petal width (cm)'],
                              figsize=(5,1))
viz_model.ctree_feature_space(nbins=40, gtype='barstacked', show={'splits','title'}, features=['petal width (cm)'],
                              figsize=(5,1.5))
A deeper tree gives a finer-grained partitioning of the single feature space:
viz_model.ctree_feature_space(show={'splits','title'}, features=['petal width (cm)'],
                              figsize=(5,1))
Let’s look at how a decision tree partitions two-dimensional feature space.
viz_model.ctree_feature_space(show={'splits','title'}, features=['petal length (cm)', 'petal width (cm)'])