Test 05: But Why
Purpose
Provide explanations for model predictions using Shapley values, helping users understand which features drive decisions and how feature values influence outcomes.
Required Inputs
From configs/env.sh:
- WORK_DIR: Output directory for results
- MODEL_PATH: Path to the model
- REPOSITORY_PATH: Path to the data repository
- TEST_SAMPLES: Path to the test samples
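Before running the test, it can help to confirm that all four variables are actually set. The following is a minimal sketch, not part of the TestKit: it assumes configs/env.sh has been sourced and its variables exported into the environment, and the helper name `missing_vars` is hypothetical.

```python
import os

# Variable names taken from the Required Inputs list above.
REQUIRED_VARS = ("WORK_DIR", "MODEL_PATH", "REPOSITORY_PATH", "TEST_SAMPLES")


def missing_vars(env):
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    # Assumption: the variables were exported by sourcing configs/env.sh.
    missing = missing_vars(os.environ)
    if missing:
        print("Missing required variables: " + ", ".join(missing))
    else:
        print("All required variables are set.")
```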
How to Run
From your TestKit folder, execute:
What This Test Does
- Calculates Shapley values for individual predictions
- Aggregates feature importance statistics
- Generates visual explanations for both global and local feature effects
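To make the first two steps concrete, here is a brute-force sketch of exact Shapley values for a single prediction, followed by a simple aggregation into global importance. It is an illustration only and does not reflect how the test itself computes these values. The toy model, the baseline vector, and the function names are all hypothetical, and mean absolute Shapley value is just one common choice of importance statistic.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all coalitions.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def global_importance(all_phi):
    """Mean absolute Shapley value per feature across samples."""
    n = len(all_phi[0])
    return [sum(abs(p[i]) for p in all_phi) / len(all_phi) for i in range(n)]


# Hypothetical toy model with features (age, bmi, smoker).
def toy_predict(features):
    age, bmi, smoker = features
    return 0.01 * age + 0.02 * bmi + 0.3 * smoker + 0.005 * age * smoker


samples = [[70, 30, 1], [40, 22, 0], [55, 27, 1]]  # synthetic samples
baseline = [50, 25, 0]                             # synthetic reference point

all_phi = [shapley_values(toy_predict, s, baseline) for s in samples]
importance = global_importance(all_phi)
```

A useful sanity check on any Shapley implementation is the efficiency property: for each sample, the values sum to the difference between the prediction for that sample and the prediction for the baseline.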
Output Location
Results are saved under $WORK_DIR/ButWhy:
- Global.html: Global signal importance in the model
- Global.ungrouped.html: Global feature importance (ungrouped)
- single_features/: Directory containing analysis for each important feature
- For each feature:
  - Stratification plots showing how feature values affect model response
  - Mean outcome for each feature value (probability of being a case)
  - Mean score for each feature value (should align with outcome graph)
  - Mean and confidence interval of Shapley value for each feature value (may differ from outcome in some cases)
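The per-feature statistics listed above can be sketched as a simple group-by over binned feature values. This is a minimal illustration under stated assumptions, not the test's own code: the row layout, the fixed-width binning, the normal-approximation 95% confidence interval, and the function name `stratify` are all hypothetical, and the rows are synthetic.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def stratify(rows, bin_width):
    """Summarize rows of (feature_value, outcome, score, shap_value) per bin."""
    groups = defaultdict(list)
    for value, outcome, score, shap in rows:
        bin_start = (value // bin_width) * bin_width
        groups[bin_start].append((outcome, score, shap))

    table = []
    for bin_start in sorted(groups):
        members = groups[bin_start]
        shaps = [m[2] for m in members]
        shap_mean = mean(shaps)
        # Normal-approximation 95% CI; zero width when only one sample exists.
        half = 1.96 * stdev(shaps) / sqrt(len(shaps)) if len(shaps) > 1 else 0.0
        table.append({
            "bin_start": bin_start,
            "n": len(members),
            "mean_outcome": mean(m[0] for m in members),  # share of cases
            "mean_score": mean(m[1] for m in members),
            "shap_mean": shap_mean,
            "shap_ci": (shap_mean - half, shap_mean + half),
        })
    return table


# Synthetic rows: (age, outcome, model score, Shapley value for age).
rows = [
    (23, 0, 0.10, -0.05),
    (27, 0, 0.20, -0.03),
    (41, 1, 0.60, 0.10),
    (45, 0, 0.40, 0.06),
    (48, 1, 0.50, 0.08),
]
table = stratify(rows, bin_width=20)
for row in table:
    print(row)
```

Plotting `mean_outcome` and `mean_score` against `bin_start` gives the two curves that should track each other, while `shap_mean` with `shap_ci` isolates the contribution attributed to this feature alone, which is why it can diverge from the outcome curve.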
How to Interpret Results
- Use global plots to identify the most influential features
- Review single feature analyses to understand how feature values impact predictions and Shapley values
- Look for expected patterns (e.g., U-shaped risk curves for age in Flu Complications) and ensure the model's logic is reasonable