Exploring Explainable AI: A Deep Dive into SHAP

We see AI models excelling at complex tasks and delivering impressive predictions. While this is undoubtedly remarkable, it raises an important question: how can we ensure the predictions are unbiased and trustworthy? This is where Explainable AI comes into play.

What is Explainable AI?

Explainable AI (XAI) is a rapidly growing field of artificial intelligence that aims to make the decision-making process of AI models transparent and understandable to humans. It is like opening the “black box” of an AI model, allowing us to see how it reaches its conclusions and to understand its reasoning.

Why does XAI matter?

Transparency – XAI helps build trust in AI systems by providing users with insights into how decisions are made. This is crucial in applications where consequences are significant, such as healthcare, finance, and autonomous vehicles.

Accountability – Explainable AI can help identify biases in AI models, ensuring decisions are fair and equitable.

Improved Model Development – Understanding the reasoning behind an AI model’s predictions can guide model refinement and improve overall performance.

Techniques in XAI

There are different techniques employed in Explainable AI. These are broadly divided into two types: local explanations and global explanations.

  • Local Explanations – Explain the model’s predictions for individual data points. Examples include LIME and SHAP¹.
  • Global Explanations – Provide insights into the overall decision-making behavior of the model. Examples include decision trees and rule extraction.

These are just a few of the many XAI techniques; other interesting approaches include counterfactual explanations, attention mechanisms, and layer visualization in deep learning.

In this blog post, I would like to take a deep dive into one of XAI’s local explanation techniques, SHAP, and how to use it.

SHAP – SHapley Additive exPlanations

SHAP is a versatile framework designed to explain the output of machine learning models. It works by assessing the contribution of each feature to the model’s prediction. The framework leverages Shapley values², a concept from cooperative game theory, to quantify these contributions. SHAP aims to provide a more rigorous, theoretically grounded way of computing feature importance than other techniques.
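For reference, the classical Shapley value of a feature i, which SHAP adapts by treating features as players and the model output as the payout, is:

     $$ \Large \phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|! \,(|F| - |S| - 1)!}{|F|!} \left[ v(S \cup \{i\}) - v(S) \right] $$

Here F is the set of all features, S ranges over subsets that exclude feature i, and v(S) is the model’s prediction when only the features in S are present; the Shapley value of feature i is its average marginal contribution over all such coalitions.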

Implementing SHAP:

  1. Choose a model to explain and a specific prediction that needs an explanation.
  2. Use the SHAP library to compute the Shapley values for the input features.
  3. The Shapley values represent the contribution of each feature to the model’s prediction.

The SHAP technique is readily available as the Python library “shap”. Follow along with the example below to explore how to use this library to understand the importance and contribution of each feature to the predictions made.
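Below is a minimal sketch of that workflow, assuming the scikit-learn diabetes dataset and a random forest regressor purely for illustration (my notebook, linked below, goes further with visualizations):

```python
import shap
import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Load the diabetes dataset (target = a measure of disease progression).
data = load_diabetes()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Step 1: choose a model to explain.
model = RandomForestRegressor(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Step 2: use the SHAP library to compute Shapley values for the input features.
explainer = shap.Explainer(model, X_train)
shap_values = explainer(X_test)

# Step 3: each value is one feature's contribution to one prediction.
print(shap_values.values.shape)    # (n_test_samples, n_features)
print(shap_values.base_values[0])  # the baseline (expected model output)

# Global summary: mean absolute SHAP value per feature.
shap.plots.bar(shap_values)
```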

To view the full code and visualization graphs, feel free to visit my GitHub – SHAP_diabetes_DS_XAI.ipynb

Interpretation of SHAP (Shapley) values

Before going into the details, let’s look at what a baseline value is.

Baseline Value – This is the average model prediction across the entire dataset. SHAP values explain how much each feature moves the prediction away from this baseline to produce the specific prediction. The baseline matters because it anchors the interpretation.

For example:

  • In a regression problem (e.g., predicting blood sugar level), the baseline might be the average blood sugar level across all patients in the dataset.
  • In a classification problem (e.g., predicting the probability of diabetes), the baseline might be the average probability across the dataset.

Predicted Value – The model’s output for the specific instance being explained.

  • A higher prediction than the baseline indicates positive contributions from some features.
  • A lower prediction than the baseline indicates negative contributions from some features.

Libraries like SHAP automatically compute this expected value (baseline value).

     $$ \Large \text{Model Prediction} = \text{Baseline Value} + \sum_{i=1}^{n} \text{SHAP Value}_i $$

Mathematically, the sum of the SHAP values across all features (additivity) for a single prediction equals the difference between the model’s output for that instance and the baseline.
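Reusing `model`, `X_test`, and `shap_values` from the earlier sketch, this additivity property can be checked directly:

```python
import numpy as np

i = 0  # any test instance
prediction = model.predict(X_test.iloc[[i]])[0]
baseline = shap_values.base_values[i]
shap_sum = shap_values.values[i].sum()

# Additivity: model prediction = baseline + sum of SHAP values
# (up to small numerical error).
print(f"prediction:          {prediction:.3f}")
print(f"baseline + SHAP sum: {baseline + shap_sum:.3f}")
assert np.isclose(prediction, baseline + shap_sum, atol=1e-3)
```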

Positive vs Negative SHAP values

  • Positive SHAP values – Indicate that the feature pushed the prediction higher than the baseline value.
  • Negative SHAP values – Indicate that the feature pushed the prediction lower than the baseline value.
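A convenient way to see these positive and negative contributions for a single prediction is a waterfall plot (again assuming the `shap_values` Explanation object from the earlier sketch):

```python
import shap

# Waterfall plot for one test instance: positive SHAP values push the
# prediction above the baseline, negative values push it below.
shap.plots.waterfall(shap_values[0])
```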

Example: Diabetes Prediction

Scenario:

  • Predicting the progression of diabetes, where higher values mean worse progression.
  • Baseline: the average progression³ value in the training data is 150.

Prediction:

For a specific patient, the model predicts 125, meaning the patient is predicted to have lower-than-average progression.

Feature                   Feature Value    SHAP Value
BMI                       25               +10
Blood Sugar Level – S2    80               −20
Age                       45               −15

                                            Prediction = 150 + 10 − 20 − 15 = 125

Blood Sugar and Age push the prediction down (lowering the risk or progression), while BMI pushes it slightly up.

Wrapping up!

Explainable AI is key to ensuring that as AI models become more sophisticated, they remain understandable, reliable, and free from biases.

By using local explanations, such as SHAP, we can break down individual predictions and gain insights into how specific features influence outcomes. This is vital not only for improving the fairness and accountability of AI systems but also for building trust in AI applications, particularly in high-stakes domains like healthcare, finance, and autonomous systems.


I hope you enjoyed reading this post as much as I did writing it! I certainly learned a lot in the process. I’ll be diving deeper into the world of XAI in upcoming posts, so stay tuned for more insights.

References:

C. Molnar, Interpretable Machine Learning: A Guide for Making Black Box Models Explainable, 2nd ed., 2022. [Online]. Available: https://christophm.github.io/interpretable-ml-book/

  1. SHAP – SHapley Additive exPlanations
  2. Shapley values – In the context of machine learning, Shapley values are used to fairly attribute the contribution of each feature to a model’s prediction.
  3. Average progression – Refers to the typical or mean rate at which the condition progresses across the dataset.