Interpretability — Understanding Why Your Model Predicts What It Does

Dive into model interpretability techniques that help you understand, explain, and trust your machine learning models. Learn SHAP, LIME, and other explainability methods.

SHAP Values — Game theory-based feature importance
LIME — Local interpretable model-agnostic explanations
Partial Dependence Plots — Visualizing feature effects

"If you can't explain it, you don't understand it well enough."

Model Interpretability — Complete Guide

Interpretability explains why a model makes specific predictions. It is essential for trust, debugging, and regulatory compliance.

Interpretability Methods

Definition: Interpretability

Interpretability is the extent to which a human can understand the reasoning behind a model's predictions. It encompasses both global (model-level) and local (prediction-level) explanations.

Global (model-level):

Feature importance (tree-based)
Permutation importance
Partial dependence plots
SHAP summary plots
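
To make two of the global methods above concrete, here is a minimal pure-Python sketch of permutation importance and a partial dependence curve. The function names, the toy model, and the synthetic data are assumptions made for illustration; in practice a library routine such as scikit-learn's inspection module would normally be used instead.

```python
import random


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in squared error when one column is shuffled."""
    rng = random.Random(seed)

    def mse(rows):
        return sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    baseline = mse(X)
    scores = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Swap in the shuffled column j; every other column is untouched.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(shuffled) - baseline)
        scores.append(sum(increases) / n_repeats)
    return scores


def partial_dependence(predict, X, j, grid):
    """Average prediction over X with feature j forced to each grid value."""
    curve = []
    for v in grid:
        preds = [predict(row[:j] + [v] + row[j + 1:]) for row in X]
        curve.append(sum(preds) / len(preds))
    return curve


# Toy model (an assumption for illustration): strong on feature 0,
# weak on feature 1, and it ignores feature 2 entirely.
def toy_model(row):
    return 3.0 * row[0] + 0.5 * row[1]


data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [toy_model(row) for row in X]

importances = permutation_importance(toy_model, X, y)
curve = partial_dependence(toy_model, X, j=0, grid=[-1.0, 0.0, 1.0])

for j, score in enumerate(importances):
    print(f"feature {j}: importance {score:.3f}")
print("partial dependence of feature 0:", [round(v, 3) for v in curve])
```

Shuffling a column the model ignores leaves the error unchanged, so its importance is zero, while the partial dependence curve for feature 0 rises with the model's slope on that feature.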

Local (prediction-level):

LIME
SHAP waterfall plots
Counterfactual explanations
Anchors
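
Of the local methods above, counterfactual explanations are the simplest to sketch by hand. The example below nudges a single feature until the predicted class flips. The `find_counterfactual` helper and the loan rule are hypothetical; dedicated counterfactual libraries search over several features and typically add distance and plausibility constraints.

```python
def find_counterfactual(predict, x, feature, step, max_steps=100):
    """Nudge one feature until the predicted class changes.

    Returns the modified row, or None if no flip occurs within max_steps.
    """
    original = predict(x)
    candidate = list(x)
    for _ in range(max_steps):
        candidate[feature] += step
        if predict(candidate) != original:
            return candidate
    return None


# Hypothetical loan rule: approve (1) when income minus twice the debt exceeds 30.
def approve(row):
    income, debt = row
    return int(income - 2 * debt > 30)


applicant = [40, 10]  # 40 - 2 * 10 = 20, so the application is rejected

raise_income = find_counterfactual(approve, applicant, feature=0, step=1)
cut_debt = find_counterfactual(approve, applicant, feature=1, step=-1)

print("approved if income rises to:", raise_income[0])
print("approved if debt falls to:", cut_debt[1])
```

Each result answers "what would need to change" for this one applicant: either a higher income or a lower debt, with everything else held fixed.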

Interpretability Spectrum

SHAP Implementation

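import shap

# Assumes `model` is a fitted tree-based model with a single output (e.g. a regressor)
# and `X_test` is a pandas DataFrame of held-out features. For multi-class models,
# shap_values holds one set of attributions per class, so select a class first.

# TreeExplainer for tree models
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)  # one row of attributions per test row

# Summary plot: global view of each feature's impact across all test rows
shap.summary_plot(shap_values, X_test)

# Force plot (single prediction, row 0)
# In a notebook, call shap.initjs() first; elsewhere, pass matplotlib=True.
shap.force_plot(explainer.expected_value, shap_values[0], X_test.iloc[0])

# Dependence plot: replace "feature_name" with a column name from X_test
shap.dependence_plot("feature_name", shap_values, X_test)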

SHAP Value (Shapley Value)

$$\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!} \left[ f(S \cup \{i\}) - f(S) \right]$$

Here,

$\phi_i$ = SHAP value for feature $i$
$N$ = Set of all features
$S$ = Subset of features not including $i$
$f(S)$ = Model prediction using features in $S$
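
The formula can be checked by brute force on a tiny example. The sketch below enumerates every subset $S$ and applies the weight exactly as written. The toy model and the convention that absent features are replaced by a zero baseline are assumptions for illustration; the shap library uses far more efficient algorithms and a data-driven background instead.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values by enumerating every subset S that excludes i."""
    n = n_features
    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # |S|! (|N| - |S| - 1)! / |N|!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                S = frozenset(subset)
                # Marginal contribution of feature i when added to S
                phi += weight * (value(S | {i}) - value(S))
        phis.append(phi)
    return phis


# Toy model (an assumption): f(x) = 2*x0 + x1*x2, explained at x = (1, 2, 3).
x = (1.0, 2.0, 3.0)
baseline = (0.0, 0.0, 0.0)


def f_of_subset(S):
    """f(S): features in S take their real value, the rest take the baseline."""
    z = [x[j] if j in S else baseline[j] for j in range(len(x))]
    return 2 * z[0] + z[1] * z[2]


phis = shapley_values(f_of_subset, n_features=3)
print([round(p, 6) for p in phis])
```

Feature 0 receives its full additive effect, features 1 and 2 split their interaction term equally, and the values sum to $f(N) - f(\emptyset)$. The enumeration grows exponentially with the number of features, which is why practical implementations rely on approximations or model-specific shortcuts.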

SHAP Waterfall Plot Visualization

LIME Implementation

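from lime.lime_tabular import LimeTabularExplainer

# Assumes `X_train` and `X_test` are pandas DataFrames, `feature_names` is a list of
# their column names, and `model` is a fitted classifier exposing predict_proba.
explainer = LimeTabularExplainer(
    X_train.values,                      # training data used to sample perturbations
    feature_names=feature_names,
    class_names=['Not Fraud', 'Fraud'],
    mode='classification'
)

# Explain single prediction (first test row) using the 10 most influential features
explanation = explainer.explain_instance(
    X_test.iloc[0].values,
    model.predict_proba,
    num_features=10
)
explanation.show_in_notebook()  # renders in Jupyter; use explanation.as_list() elsewhere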
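
The idea behind LIME can be sketched without the library: perturb the instance, query the black-box model, weight each sample by its proximity to the instance, and fit a weighted linear model whose slopes serve as the explanation. The version below is a simplified illustration, not the lime package's actual procedure; the kernel, the sampling scheme, and all function names are assumptions.

```python
import math
import random


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def local_linear_explanation(predict, x, n_samples=500, scale=0.1,
                             kernel_width=0.25, seed=0):
    """Fit a proximity-weighted linear model around x and return its slopes."""
    rng = random.Random(seed)
    d = len(x)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        z = [xi + rng.gauss(0, scale) for xi in x]      # perturbed neighbour
        dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x))
        rows.append([1.0] + z)                          # leading 1.0 = intercept
        targets.append(predict(z))
        weights.append(math.exp(-dist2 / kernel_width ** 2))
    # Weighted normal equations: (X^T W X) beta = X^T W y
    A = [[sum(w * r[a] * r[b] for r, w in zip(rows, weights))
          for b in range(d + 1)] for a in range(d + 1)]
    rhs = [sum(w * r[a] * t for r, w, t in zip(rows, weights, targets))
           for a in range(d + 1)]
    beta = solve_linear_system(A, rhs)
    return beta[1:]  # drop the intercept, keep one slope per feature


# Hypothetical black box: nonlinear in feature 0, linear in feature 1.
def black_box(z):
    return z[0] ** 2 + 3.0 * z[1]


slopes = local_linear_explanation(black_box, [2.0, 1.0])
print([round(s, 2) for s in slopes])
```

Near the point (2, 1) the black box behaves almost linearly, so the surrogate's slopes should land close to the local gradient (4, 3). The explanation holds only in that neighbourhood, which is what "local" means here.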

LIME Local Approximation

Key Takeaways

Summary: Model Interpretability

SHAP provides theoretically sound feature attributions
LIME creates local interpretable explanations
Feature importance shows global feature relevance
Partial dependence plots show feature effects
Counterfactuals explain "what would need to change"
Model-agnostic methods such as LIME and permutation importance need only a prediction function, so they work with any model
Regulations such as GDPR and the EU AI Act can impose transparency or explanation obligations on certain automated decisions and high-risk systems
Use interpretability for debugging and trust-building
