Lecture 9: Global Explanations

Munir Hiabu

Interpretability

  • While Shapley values explain the model at a local scale, e.g. they answer the question of why the model predicted what it did for a single observation, they don’t provide an easily interpretable description for the general trends caught in the model.

  • In this lecture we will look at global post-hoc explanations.

Global explanations: ICE plots

Definition [Individual Conditional Expectation (ICE) plots]

The ICE plots for feature \(k\) are defined as \[ \eta_i(x_k)=\hat m_n(x_k, X_{i,-k}), \quad i=1,\dots,n. \]

  • Each curve \(\eta_i(x_k)\) displays how feature \(k\) affects the prediction for individual \(i\) when the other features are held fixed at their observed values \(X_{i,-k}\).

  • If all curves \(\eta_i(x_k)\), \(i=1,\dots, n\), are plotted together, this may give insight into the effect of \(x_k\) under different conditions.

  • A drawback is that, as with interventional SHAP values, some parts of the curves are the result of extrapolation, making the values questionable in those regions.
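The ICE construction above can be written in a few lines. The following is an illustrative Python sketch (the lecture's examples use R's ranger, so the `ice_curves` helper and the toy model below are assumptions for illustration; `model` can be any prediction function):

```python
import numpy as np

def ice_curves(model, X, k, grid):
    """Compute ICE curves eta_i(x_k) = m(x_k, X_{i,-k}).

    Returns an array of shape (n, len(grid)), one curve per observation;
    `model` is any callable mapping an (n, p) array to n predictions.
    """
    X = np.asarray(X, dtype=float)
    curves = np.empty((X.shape[0], len(grid)))
    for j, v in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, k] = v              # set feature k to the grid value
        curves[:, j] = model(X_mod)  # other features stay at X_{i,-k}
    return curves

# Toy stand-in for a fitted model \hat m_n: m(x) = x_0^2 + x_1
model = lambda X: X[:, 0] ** 2 + X[:, 1]
rng = np.random.default_rng(0)
X = rng.uniform(size=(5, 2))
grid = np.linspace(0.0, 1.0, 3)
curves = ice_curves(model, X, k=0, grid=grid)
```

Plotting each row of `curves` against `grid` gives one ICE curve per observation.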

Global explanations: ICE plots

  • ICE plots after fitting the Biking data with ranger.

Global explanations: PDP

Definition [Partial dependence plots (PDP)]

The PDP for feature \(k\) is defined as \[ \xi_k(x_k) = \mathbb E_X [\hat m_n(x_k,X_{-k})]. \] Its empirical version is given by \[ \hat \xi_k(x_k)= \frac 1 n \sum_{i=1}^n \eta_i(x_k) = \frac 1 n \sum_{i=1}^n \hat m_n(x_k, X_{i,-k}). \] More generally, the PDP for a set of features \(S\subseteq I_p\) is defined as \[ \xi_S(x_S) = \mathbb E_X [\hat m_n(x_S,X_{-S})] = \int \hat{m}_n(x_S, x_{-S})p_{X_{-S}}(x_{-S})\, \mathrm dx_{-S}.\]

The empirical version is given by \[ \hat{\xi}_S(x_S) = \frac{1}{n}\sum_{i=1}^n \hat{m}_n(x_S, X_{i,-S}).\]
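Since the empirical PDP is just the pointwise average of the ICE curves, it can be sketched analogously (again an illustrative Python sketch, with a toy additive model standing in for \(\hat m_n\)):

```python
import numpy as np

def pdp(model, X, k, grid):
    """Empirical PDP: hat xi_k(x_k) = (1/n) * sum_i m(x_k, X_{i,-k})."""
    X = np.asarray(X, dtype=float)
    values = np.empty(len(grid))
    for j, v in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, k] = v                    # intervene on feature k
        values[j] = model(X_mod).mean()    # average over all observations
    return values

# For the additive toy model m(x) = x_0^2 + x_1, the empirical PDP of
# feature 0 is exactly grid**2 + mean(X[:, 1]).
model = lambda X: X[:, 0] ** 2 + X[:, 1]
rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 2))
grid = np.linspace(0.0, 1.0, 5)
curve = pdp(model, X, k=0, grid=grid)
```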

Global explanations: PDP

  • PDP after fitting the Biking data with ranger.



The Extrapolation Issue


  • Assume that \(p=2\) and that \(X_1\) and \(X_2\) are highly correlated: \[ X_1= U+ \varepsilon_1, \quad X_2= U+ \varepsilon_2, \quad \varepsilon_1,\varepsilon_2 \sim N(0, 0.05^2) \text{ i.i.d.}, \quad U\sim \mathrm{Unif}[0,1]. \]
  • A simulated scatterplot from 100 observations is shown below
    • The empirical partial dependence plot and the empirical value function of interventional SHAP for \(X_1\) at \(x_1 = 0.3\) are calculated by taking a simple average of predictions over the blue dots
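The simulated design can be reproduced as follows. Counting how many of the synthetic points \((0.3, X_{i,2})\) fall far from the diagonal \(x_2 \approx x_1\) shows that most of the averaged predictions are evaluated off the data manifold (a minimal sketch; the cutoff 0.15 is an arbitrary illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
U = rng.uniform(size=n)
X1 = U + rng.normal(0.0, 0.05, size=n)
X2 = U + rng.normal(0.0, 0.05, size=n)

# Interventional evaluation at x1 = 0.3 pairs this value with every
# observed X2_i, even though the data concentrate around X2 ~ X1.
# Fraction of synthetic points (0.3, X2_i) far from the diagonal:
far = np.abs(X2 - 0.3) > 0.15
frac_extrapolated = far.mean()
```

Since \(X_2\) is roughly uniform on \([0,1]\), the large majority of the synthetic points lie in regions where no data were observed.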

The Extrapolation Issue

  • The problem: The integral in the definition of \(v_x(S)=\xi_S(x_S)\) runs over the whole support of \(p_{X_{-S}}\), regardless of whether \((x_S, x_{-S})\) is a plausible data point.

  • One solution: Modify the definition and let the bounds of the integral depend on the support of \(X\), \(\mathcal X\subseteq \mathbb R^p\): \[ \tilde \xi_S(x_S) = \frac{ \int_{x_{-S}: (x_S,x_{-S})\in \mathcal X} \hat{m}_n(x_S, x_{-S})p_{X_{-S}}(x_{-S}) \mathrm dx_{-S}}{\int_{x_{-S}: (x_S,x_{-S})\in \mathcal X} p_{X_{-S}}(x_{-S}) \mathrm dx_{-S} }. \]

  • This solution does not yet seem to be widely used in applications, and one would need to develop an empirical version of \(\tilde \xi_S(x_S)\) that is stable and fast to compute. One such proposal can be found in Taufiq, Blöbaum, and Minorics (2023).
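One crude way to approximate the support-restricted \(\tilde \xi_S(x_S)\) empirically is to average only over those background observations whose synthetic point lies close to some real observation. The nearest-neighbour criterion and the radius below are illustrative assumptions, not the construction of Taufiq, Blöbaum, and Minorics (2023):

```python
import numpy as np

def restricted_pdp_at(model, X, k, x_k, radius=0.1):
    """Support-restricted PDP at a single value x_k (illustrative sketch).

    Averages m(x_k, X_{i,-k}) only over observations i whose synthetic
    point (x_k, X_{i,-k}) lies within `radius` of a real observation,
    a crude proxy for restricting the integral to the support of X.
    """
    X = np.asarray(X, dtype=float)
    X_mod = X.copy()
    X_mod[:, k] = x_k
    # distance of each synthetic point to its nearest real observation
    d = np.linalg.norm(X_mod[:, None, :] - X[None, :, :], axis=2).min(axis=1)
    keep = d <= radius
    if not keep.any():
        return np.nan          # x_k lies outside the observed support
    return model(X_mod[keep]).mean()

# With the highly correlated design from the simulation above, the naive
# PDP of the toy model m(x) = x_1 + x_2 at x_1 = 0.3 is roughly
# 0.3 + E[X_2] = 0.8, while the restricted version stays near the
# realistic value 0.3 + 0.3 = 0.6.
rng = np.random.default_rng(1)
n, U = 100, rng.uniform(size=100)
X = np.column_stack([U + rng.normal(0, 0.05, n), U + rng.normal(0, 0.05, n)])
model = lambda X: X[:, 0] + X[:, 1]
naive = model(np.column_stack([np.full(n, 0.3), X[:, 1]])).mean()
restricted = restricted_pdp_at(model, X, k=0, x_k=0.3)
```

The quadratic pairwise-distance computation is fine for a sketch but would need a faster support estimate (e.g. a k-d tree or a density model) at scale.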

References

Taufiq, Muhammad Faaiz, Patrick Blöbaum, and Lenon Minorics. 2023. “Manifold Restricted Interventional Shapley Values.” In International Conference on Artificial Intelligence and Statistics, 5079–5106. PMLR.