Visualizing tree-based classifiers#

(View this notebook in Colab)

The dtreeviz library is designed to help machine learning practitioners visualize and interpret decision trees and decision-tree-based models, such as gradient boosting machines.

The purpose of this notebook is to illustrate the main capabilities and functions of the dtreeviz API. To do that, we will use scikit-learn and the toy but well-known Titanic data set. dtreeviz currently supports several decision tree libraries, including scikit-learn, which is what we use here.

To interoperate with these different libraries, dtreeviz uses an adaptor object, obtained from the function dtreeviz.model(), to extract the model information necessary for visualization. Given such an adaptor object, all of the dtreeviz functionality is available through the same programming interface. The basic dtreeviz usage recipe is as follows (a compact sketch appears after the list):

  1. Import dtreeviz and your decision tree library

  2. Acquire and load data into memory

  3. Train a classifier or regressor model using your decision tree library

  4. Obtain a dtreeviz adaptor model using
    viz_model = dtreeviz.model(your_trained_model,...)

  5. Call dtreeviz functions, such as
    viz_model.view() or viz_model.explain_prediction_path(sample_x)
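
Here is a minimal end-to-end sketch of that recipe. It uses scikit-learn and its built-in iris data purely for illustration; the Titanic example worked through below follows exactly the same steps.

import dtreeviz
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()                                     # 2. load data
clf = DecisionTreeClassifier(max_depth=2)
clf.fit(iris.data, iris.target)                        # 3. train a model

viz_model = dtreeviz.model(clf,                        # 4. get the adaptor
                           X_train=iris.data, y_train=iris.target,
                           feature_names=iris.feature_names,
                           target_name='iris',
                           class_names=list(iris.target_names))

viz_model.view()                                       # 5. call dtreeviz functions
print(viz_model.explain_prediction_path(iris.data[0]))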

The four categories of dtreeviz functionality are:

  1. Tree visualizations

  2. Prediction path explanations

  3. Leaf information

  4. Feature space exploration

We have grouped code examples by classifiers and regressors, with a follow-up section on partitioning feature space.

These examples require dtreeviz 2.0 or above because the code uses the new API introduced in 2.0.
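
If you are not sure which version you have installed, one quick way to check (using only the standard library, so it makes no assumptions about dtreeviz's own attributes) is:

import importlib.metadata
print(importlib.metadata.version("dtreeviz"))  # the examples below assume 2.0 or above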

Setup#

import sys
import os
%config InlineBackend.figure_format = 'retina' # Make visualizations look good
#%config InlineBackend.figure_format = 'svg'
%matplotlib inline

if 'google.colab' in sys.modules:
  !pip install -q dtreeviz
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

import dtreeviz

random_state = 1234 # get reproducible trees

Load Sample Data#

dataset_url = "https://raw.githubusercontent.com/parrt/dtreeviz/master/data/titanic/titanic.csv"
dataset = pd.read_csv(dataset_url)
# Fill missing values for Age
dataset.fillna({"Age":dataset.Age.mean()}, inplace=True)
# Encode categorical variables
dataset["Sex_label"] = dataset.Sex.astype("category").cat.codes
dataset["Cabin_label"] = dataset.Cabin.astype("category").cat.codes
dataset["Embarked_label"] = dataset.Embarked.astype("category").cat.codes

To demonstrate classifier decision trees, we train a model that uses six features to predict the boolean Survived target.

features = ["Pclass", "Age", "Fare", "Sex_label", "Cabin_label", "Embarked_label"]
target = "Survived"

tree_classifier = DecisionTreeClassifier(max_depth=3, random_state=random_state)
tree_classifier.fit(dataset[features].values, dataset[target].values)
DecisionTreeClassifier(max_depth=3, random_state=1234)

Initialize dtreeviz model (adaptor)#

To adapt dtreeviz to a specific model, use the model() function to get an adaptor. You’ll need to provide the model, X/y data, feature names, target name, and target class names:

viz_model = dtreeviz.model(tree_classifier,
                           X_train=dataset[features], y_train=dataset[target],
                           feature_names=features,
                           target_name=target, class_names=["perish", "survive"])

We’ll use this model to demonstrate dtreeviz functionality in the following sections; the code will look the same for any decision tree library once we have this model adaptor.
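
For example, the same call pattern also covers tree ensembles. The sketch below is hypothetical: it assumes the tree_index parameter of dtreeviz.model() (not demonstrated elsewhere in this notebook), which selects the individual tree in the ensemble to adapt.

from sklearn.ensemble import RandomForestClassifier

# Hypothetical sketch: adapt a single tree from a random forest.
# tree_index is assumed from the dtreeviz 2.x API; it picks which tree to visualize.
rf = RandomForestClassifier(n_estimators=10, max_depth=3, random_state=random_state)
rf.fit(dataset[features].values, dataset[target].values)

viz_rf = dtreeviz.model(rf, tree_index=0,
                        X_train=dataset[features], y_train=dataset[target],
                        feature_names=features,
                        target_name=target, class_names=["perish", "survive"])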

Tree structure visualizations#

To show the decision tree structure using the default visualization, call view():

viz_model.view(scale=0.8)

To change the visualization, you can pass parameters, such as changing the orientation to left-to-right:

viz_model.view(orientation="LR")

To visualize larger trees, you can reduce the amount of detail by turning off the fancy view:

viz_model.view(fancy=False)

Another way to reduce the visualization size is to specify the tree depths of interest:

viz_model.view(depth_range_to_display=(1, 2)) # root is level 0

Prediction path explanations#

For interpretation purposes, we often want to understand how the tree behaves for a specific instance. Let’s pick one:

x = dataset[features].iloc[10]
x
Pclass              3.0
Age                 4.0
Fare               16.7
Sex_label           0.0
Cabin_label       145.0
Embarked_label      2.0
Name: 10, dtype: float64

and then display the path through the tree structure:

viz_model.view(x=x)
viz_model.view(x=x, show_just_path=True)

You can also get a string representation explaining the comparisons made as an instance is run down the tree:

print(viz_model.explain_prediction_path(x))
2.5 <= Pclass 
Fare < 23.35
Sex_label < 0.5

If you’d like the feature importance for a specific instance, as calculated by the underlying decision tree library, use instance_feature_importance():

viz_model.instance_feature_importance(x, figsize=(3.5,2))

Leaf info#

There are a number of functions to get information about the leaves of the tree.

viz_model.leaf_sizes(figsize=(3.5,2))
viz_model.ctree_leaf_distributions(figsize=(3.5,2))
viz_model.node_stats(node_id=6)
       Pclass        Age       Fare  Sex_label  Cabin_label  Embarked_label
count   117.0      117.0      117.0      117.0        117.0           117.0
mean      3.0  23.976667  11.722829        0.0     6.196581         1.34188
std       0.0  10.534377   4.695136        0.0    31.167855        0.789614
min       3.0       0.75       6.75        0.0         -1.0             0.0
25%       3.0       18.0      7.775        0.0         -1.0             1.0
50%       3.0       27.0     9.5875        0.0         -1.0             2.0
75%       3.0  29.699118       15.5        0.0         -1.0             2.0
max       3.0       63.0      23.25        0.0        145.0             2.0
viz_model.leaf_purity(figsize=(3.5,2))

Feature Space Partitioning#

Decision trees partition feature space in such a way as to maximize target value purity for the instances associated with a node. It’s often useful to visualize the feature space partitioning, although it’s not feasible to visualize more than a couple of dimensions.

Classification#

To visualize how a decision tree partitions a single feature, let’s train a shallow decision tree classifier on the toy Iris data.

from sklearn.datasets import load_iris
iris = load_iris()
features = list(iris.feature_names)
class_names = iris.target_names
X = iris.data
y = iris.target
dtc_iris = DecisionTreeClassifier(max_depth=2, min_samples_leaf=1, random_state=666)
dtc_iris.fit(X, y)
DecisionTreeClassifier(max_depth=2, random_state=666)
viz_model = dtreeviz.model(dtc_iris,
                           X_train=X, y_train=y,
                           feature_names=features,
                           target_name='iris',
                           class_names=class_names)

The following diagram indicates that the decision tree splits the petal width feature into three mostly-pure regions (using random_state above to get the same tree each time):

viz_model.ctree_feature_space(show={'splits','title'}, features=['petal width (cm)'],
                             figsize=(5,1))
viz_model.ctree_feature_space(nbins=40, gtype='barstacked', show={'splits','title'}, features=['petal width (cm)'],
                             figsize=(5,1.5))

A deeper tree gives a finer-grained partitioning of the single feature space.
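
To get that deeper tree, re-fit the classifier and rebuild the adaptor before calling ctree_feature_space() again. A minimal sketch, assuming max_depth=4 as an illustrative choice (the depth used for the original figure is not shown here):

dtc_iris = DecisionTreeClassifier(max_depth=4, min_samples_leaf=1, random_state=666)
dtc_iris.fit(X, y)
viz_model = dtreeviz.model(dtc_iris,
                           X_train=X, y_train=y,
                           feature_names=features,
                           target_name='iris',
                           class_names=class_names)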

viz_model.ctree_feature_space(show={'splits','title'}, features=['petal width (cm)'],
                              figsize=(5,1))

Let’s look at how a decision tree partitions two-dimensional feature space.

viz_model.ctree_feature_space(show={'splits','title'}, features=['petal length (cm)', 'petal width (cm)'])