Test 13: Model Explainability
Purpose
Explore and interpret model predictions by analyzing real data examples of high-risk patients and identifying the most common reasons for being flagged. The analysis focuses on the top 1000 patients with the highest scores. We want to test both model correctness and our explainability method before production.
Required Inputs
From configs/env.sh:
- WORK_DIR: Output directory for results
- EXPLAINABLE_MODEL: Path to the model with explainability features
- REPOSITORY_PATH: Path to the data repository
- TEST_SAMPLES: Path to the test samples
- EXPLAIN_JSON: JSON for bootstrap filtering
- EXPLAIN_COHORT: (Optional) Filter to focus on specific explainability samples
How to Run
From your TestKit folder, execute:
What This Test Does
- Uses MES explainability extension to generate explanations for model behavior (method reference)
- Analyzes the top 1000 high-risk patients to find the most common contributing features
- Provides statistical summaries and detailed examples for exploration
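The summary step can be pictured with a minimal Python sketch. It is not the MES implementation: the record layout (`score`, `contributions`), the function name `top_reason_frequency`, and the choice to rank reasons by absolute contribution are all assumptions made for illustration.

```python
from collections import Counter

def top_reason_frequency(patients, top_n=1000, top_k=3):
    """Percent of the top_n highest-scoring patients for whom each
    feature group is among their top_k reasons.

    `patients` is a hypothetical structure: a list of dicts with a
    'score' and a 'contributions' mapping (group name -> signed value).
    """
    # Keep only the highest-scoring patients.
    ranked = sorted(patients, key=lambda p: p["score"], reverse=True)[:top_n]
    counts = Counter()
    for p in ranked:
        contrib = p["contributions"]
        # Assumption: "top reasons" are ranked by absolute contribution.
        top_groups = sorted(contrib, key=lambda g: abs(contrib[g]), reverse=True)[:top_k]
        counts.update(top_groups)
    total = len(ranked)
    return {g: 100.0 * c / total for g, c in counts.most_common()}

# Made-up toy data, not real test output.
patients = [
    {"score": 0.9, "contributions": {"Smoking": 1.5, "COPD": 0.4, "BMI": 0.3, "WBC": 0.1}},
    {"score": 0.8, "contributions": {"Smoking": 1.2, "WBC": 0.7, "BMI": -0.5, "COPD": 0.1}},
    {"score": 0.1, "contributions": {"Smoking": -0.2, "BMI": 0.1, "COPD": 0.0, "WBC": 0.05}},
]
freq = top_reason_frequency(patients, top_n=2)
print(freq)
```

With `top_n=2`, only the two highest-scoring toy patients are counted, so a group that is in the top three reasons for both of them gets 100%.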
Output Location
- $WORK_DIR/ButWhy/explainer_examples/group_stats*.tsv: Summary tables of the most common reasons for high scores (e.g., Smoking, COPD diagnosis, BMI, WBC)
- test_report.*.tsv: Example reports for high-risk patients, showing grouped rows by risk factor from most to least important
Example Output
group_stats*.tsv (summary of common reasons in high-risk patients):
How to Interpret this file
The most influential risk factors identified by the model are listed below, based on how frequently they appear in a patient's top three reasons for a high risk score.
The most frequent risk factor in this model is Smoking, which appears among the top three reasons in 99.7% of cases; its leading feature is Smoking.Smoking_Years. The next contributor is COPD diagnosis, which appears in the top three in 53.8% of cases, followed by BMI (40.5%) and WBC (28.2%).
test_report.*.tsv (example patient report):
How to Interpret this file
This output file uses SHAP values to explain how each variable contributes to a patient's final risk score. For patient 100192, the risk score was 0.445575, which correctly predicted the Case (Outcome=1).
- Positive SHAP Values (Risk-Increasing): Indicate features that push the score higher (toward greater risk).
- Negative SHAP Values (Protective/Risk-Decreasing): Indicate features that push the score lower.
The patient's risk is heavily dominated by two features:
- Smoking: This is the main contributor, with a SHAP value of +1.51, which is 27.38% of the patient's total absolute SHAP value. The core feature, Smoking_Years, is high at 40.13, confirming a long history of smoking.
- WBC (White Blood Cell Count): This is the second-largest factor, contributing +0.708 (or 12.81% of the total). The patient's WBC level is elevated (last measured at 17.5), which significantly increases the risk assessment.
Conversely, a specific diagnosis ICD9_Diagnosis.ICD9_CODE:786 (Symptoms_involving_respiratory_system_and_other_chest_symptoms) is shown to be slightly protective with a SHAP value of -0.11. This negative value is due to the patient lacking this diagnosis (feature value of 0) in the preceding 10 years, which mildly reduces the overall calculated risk.
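The percentages quoted for each factor are shares of the total absolute SHAP value. A minimal sketch of that arithmetic follows; the function name `contribution_shares` and the input values are illustrative assumptions, not the values from the report above.

```python
def contribution_shares(shap_by_group):
    """Return, per group, its direction and its percent share of the
    total absolute SHAP value.

    `shap_by_group` maps a group name to its signed SHAP value.
    """
    total_abs = sum(abs(v) for v in shap_by_group.values())
    shares = {}
    for group, value in shap_by_group.items():
        pct = 100.0 * abs(value) / total_abs if total_abs else 0.0
        # Positive values push the score up, negative values push it down.
        direction = "risk-increasing" if value > 0 else "risk-decreasing"
        shares[group] = (direction, pct)
    return shares

# Made-up values chosen so the arithmetic is easy to follow.
example = {"Smoking": 1.5, "WBC": 0.5, "Diagnosis_786": -0.5, "Other": 2.5}
shares = contribution_shares(example)
for group, (direction, pct) in shares.items():
    print(f"{group}: {direction}, {pct:.2f}% of total |SHAP|")
```

Because the denominator uses absolute values, a protective factor still takes up a positive share of the total even though its SHAP value is negative.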
How to Interpret Results
- Use summary tables to identify the most frequent risk factors among high-risk patients
- Review individual patient reports to understand which features contribute most to their risk scores
- Look for expected patterns (e.g., Smoking as a top risk factor in lung cancer) and ensure explanations are clinically meaningful
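The check for expected patterns can also be automated as a rough guard. The sketch below is an assumption about how one might do it, not part of the TestKit: the function name `missing_expected_factors`, the dictionary keys, and the 50% threshold are hypothetical, while the percentages are the ones quoted in the example above.

```python
def missing_expected_factors(summary_pct, expected, min_pct=50.0):
    """Return the expected factors that appear in the top reasons
    less often than min_pct percent of the time.

    `summary_pct` maps a factor name to the percent of high-risk
    patients for whom it is a top reason.
    """
    return [g for g in expected if summary_pct.get(g, 0.0) < min_pct]

# Percentages from the example summary; the key names are assumed.
summary = {"Smoking": 99.7, "COPD": 53.8, "BMI": 40.5, "WBC": 28.2}

# Smoking is expected to dominate in a lung cancer model.
problems = missing_expected_factors(summary, ["Smoking"])
print("Missing expected factors:", problems)
```

An empty list means every expected factor clears the threshold. Anything returned is a prompt for clinical review of the explanations rather than an automatic failure.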