Test 13 - But Why (Shapley)
Purpose
Produces Shapley explanations and global feature-importance visualizations for the model using the test cohort. It helps identify which features drive predictions and creates per-feature explainability graphs.
Required Inputs
- WORK_DIR: working directory containing repository and model artifacts
- MODEL_PATH: path to the fitted model (expected ${WORK_DIR}/model/model.medmdl)
- ${WORK_DIR}/rep/test.repository and ${WORK_DIR}/Samples/3.test_cohort.samples
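The inputs listed above can be checked before the test is launched. The following is a minimal sketch, not part of the test itself; the function name missing_inputs is hypothetical, and it assumes WORK_DIR and MODEL_PATH are available as environment variables.

```python
import os
from pathlib import Path


def missing_inputs(work_dir, model_path=None):
    """Return the required input paths that do not exist on disk."""
    work = Path(work_dir)
    # Fall back to the expected model location when MODEL_PATH is not given.
    model = Path(model_path) if model_path else work / "model" / "model.medmdl"
    required = [
        model,
        work / "rep" / "test.repository",
        work / "Samples" / "3.test_cohort.samples",
    ]
    return [str(p) for p in required if not p.exists()]


if __name__ == "__main__":
    # Assumption: both variables are exported in the calling shell.
    work_dir = os.environ.get("WORK_DIR", ".")
    for path in missing_inputs(work_dir, os.environ.get("MODEL_PATH")):
        print(f"missing: {path}")
```

An empty result means all three inputs were found; anything printed should be fixed before running the test.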
How to Run
What This Test Does
- Requests SHAP values from Flow twice: grouped by signal category and ungrouped (full features). Outputs are written to ${WORK_DIR}/ButWhy/shapley_grouped.report and ${WORK_DIR}/ButWhy/shapley.report.
- Uses feature_importance_printer.py to generate global HTML reports: ${WORK_DIR}/ButWhy/Global.html (grouped) and ${WORK_DIR}/ButWhy/Global.ungrouped.html.
- Generates per-feature HTML files under ${WORK_DIR}/ButWhy/single_features/ by expanding the ungrouped SHAP report.
- Patches HTML files to use a local Plotly JS (../js/plotly.js).
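The last step, pointing the generated HTML at a local Plotly copy, could look roughly like the sketch below. This is an illustration rather than the test's actual implementation: the function names are hypothetical, and it assumes the generated pages load Plotly through a script tag whose src contains "plotly" and ends in ".js" (for example a CDN URL).

```python
import re
from pathlib import Path

# Matches the opening part of a <script> tag whose src points at a Plotly bundle.
# Group 1 is everything up to the opening quote, group 2 the URL, group 3 the closing quote.
PLOTLY_SRC = re.compile(
    r'(<script\b[^>]*\bsrc=["\'])([^"\']*plotly[^"\']*\.js)(["\'])',
    re.IGNORECASE,
)


def use_local_plotly(html, local_src="../js/plotly.js"):
    """Return html with every Plotly script src replaced by local_src."""
    return PLOTLY_SRC.sub(lambda m: m.group(1) + local_src + m.group(3), html)


def patch_file(path, local_src="../js/plotly.js"):
    """Rewrite one HTML file in place; return True if its content changed."""
    path = Path(path)
    original = path.read_text(encoding="utf-8")
    patched = use_local_plotly(original, local_src)
    if patched != original:
        path.write_text(patched, encoding="utf-8")
    return patched != original
```

The substitution is idempotent: a page that already references ../js/plotly.js is left unchanged, so re-running the patch is safe.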
Output Location
${WORK_DIR}/ButWhy/ - contains shapley.report, shapley_grouped.report, Global.html, Global.ungrouped.html, and single_features/ HTMLs.
How to Interpret Results
- Global.html gives a ranked list of feature groups and their aggregate importance. Use it to identify which signal categories contribute most to the model.
- single_features/ contains per-feature SHAP distribution plots useful for detailed investigation.
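To make "aggregate importance" concrete, the sketch below ranks features by mean absolute SHAP value, a common convention for global importance, and shows one way to roll features up into groups. It is an assumption-laden illustration: the aggregation Flow and feature_importance_printer.py actually use is not documented here, the function names are hypothetical, and the feature names and values are made-up toy data.

```python
from collections import defaultdict


def global_importance(shap_rows):
    """Rank columns by mean absolute SHAP value, most important first.

    shap_rows: list of dicts, one per sample, mapping a name to its SHAP value.
    """
    if not shap_rows:
        return []
    totals = defaultdict(float)
    for row in shap_rows:
        for name, value in row.items():
            totals[name] += abs(value)
    n = len(shap_rows)
    ranked = [(name, total / n) for name, total in totals.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


def group_rows(shap_rows, group_of):
    """Sum the SHAP values of features sharing a group, sample by sample."""
    grouped = []
    for row in shap_rows:
        sums = defaultdict(float)
        for name, value in row.items():
            sums[group_of(name)] += value
        grouped.append(dict(sums))
    return grouped


# Hypothetical toy data, for illustration only.
rows = [
    {"Hemoglobin.last": 0.5, "Hemoglobin.min": -0.25, "Age": 0.125},
    {"Hemoglobin.last": -0.5, "Hemoglobin.min": 0.25, "Age": 0.125},
]
by_feature = global_importance(rows)
# Assumption: the group is the part of the feature name before the first dot.
by_group = global_importance(group_rows(rows, lambda name: name.split(".")[0]))
```

In this toy data the Hemoglobin group scores 0.25 even though its two features score 0.5 and 0.25 individually, because opposite-signed contributions cancel when summed per sample. Under this convention, a group's rank in the grouped report need not match the sum of its features' ranks in the ungrouped one.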
Troubleshooting
- Missing model or repository: ensure MODEL_PATH and ${WORK_DIR}/rep/test.repository exist.
- Flow failures: run the Flow --shap_val_request command manually to debug arguments (e.g.).
- Plotly not loading: ensure ../js/plotly.js exists relative to the generated HTML; the test rewrites script tags to ../js/plotly.js.
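For the Plotly case, a quick way to find affected pages is to resolve the relative script path against each HTML file's own directory, which is how a browser resolves it. The sketch below is a hypothetical helper, assuming every generated page uses the same relative path ../js/plotly.js.

```python
from pathlib import Path


def htmls_missing_plotly(butwhy_dir, rel_src="../js/plotly.js"):
    """Return HTML files under butwhy_dir whose relative Plotly path does not exist."""
    root = Path(butwhy_dir)
    missing = []
    for html in sorted(root.rglob("*.html")):
        # The relative src is resolved against the directory holding the HTML file.
        target = (html.parent / rel_src).resolve()
        if not target.is_file():
            missing.append(str(html))
    return missing
```

Because the path is relative, pages directly under ButWhy/ and pages under ButWhy/single_features/ resolve to different locations (one directory above each file), so both copies of plotly.js may be needed if both sets of pages use the same src.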
Files to inspect
- ${WORK_DIR}/ButWhy/shapley.report
- ${WORK_DIR}/ButWhy/shapley_grouped.report
- ${WORK_DIR}/ButWhy/Global*.html
- ${WORK_DIR}/ButWhy/single_features/*.html