Interpretability and explainability

Machine learning algorithms are about finding correlations in data. In contrast, interpretable machine learning is about understanding causality.

Applying domain knowledge requires an understanding of the model and of how to interpret its output.

Often, we need to understand the individual predictions a model is making.

For instance, a model might reject a loan application or flag a transaction as fraudulent, and the people affected will want to know which factors drove that decision.
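
The sketch below illustrates what such a per-prediction explanation can look like in the simplest setting: for a linear model such as logistic regression, the contribution of feature i to the log-odds is just coef_i * x_i. The use of scikit-learn and the breast-cancer dataset here are assumptions for illustration, not something prescribed by these notes.

```python
# A minimal sketch of a per-prediction explanation for a linear model.
# scikit-learn and the breast-cancer dataset are illustrative assumptions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X = StandardScaler().fit_transform(data.data)
model = LogisticRegression(max_iter=1000).fit(X, data.target)

# For logistic regression the log-odds are intercept + sum_i coef_i * x_i,
# so coef_i * x_i is feature i's additive contribution to this prediction.
x = X[0]
contributions = model.coef_[0] * x
print("Predicted class:", data.target_names[model.predict(x.reshape(1, -1))[0]])
for i in np.argsort(np.abs(contributions))[::-1][:5]:
    print(f"{data.feature_names[i]:>25s}: {contributions[i]:+.3f}")
```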

Usefulness of interpretability

We need to understand differences at the dataset level:

Model debugging. We might want to understand why a model that worked previously is no longer working when applied to new data; comparing the old and new data is a natural first step, as in the sketch below.
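
One simple way to start this kind of debugging is to compare feature distributions between the data the model was trained on and the data it now sees. The sketch below runs a two-sample Kolmogorov-Smirnov test per feature; X_train and X_new are synthetic stand-ins, since the notes do not name a dataset.

```python
# A minimal sketch of dataset-level debugging: flag features whose
# distribution has shifted between the old data and the new data.
# X_train and X_new are synthetic stand-ins for illustration.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(1000, 3))
X_new = rng.normal(0.0, 1.0, size=(1000, 3))
X_new[:, 2] += 1.5  # simulate drift in one feature

for i in range(X_train.shape[1]):
    stat, p = ks_2samp(X_train[:, i], X_new[:, i])
    verdict = "possible drift" if p < 0.01 else "ok"
    print(f"feature {i}: KS statistic={stat:.3f}, p={p:.1e} -> {verdict}")
```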

Dimensions of interpretability


What about decision trees and rules?
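
Decision trees and rule-based models are usually considered intrinsically interpretable: the learned structure can be read off directly as if/else rules, so the model is, in a sense, its own explanation. As a minimal sketch, assuming scikit-learn and the Iris dataset purely for illustration:

```python
# A minimal sketch: a shallow decision tree can be printed as explicit rules.
# scikit-learn and the Iris dataset are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# export_text renders the fitted tree as human-readable if/else rules.
print(export_text(clf, feature_names=list(iris.feature_names)))
```

Whether such rules stay interpretable in practice depends on depth and feature count: a very deep tree is technically transparent but no longer humanly readable.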