Test 15: Compare to Baseline Model
Purpose
Compare your model to a baseline model by evaluating performance, correlation, and which patients are flagged. This analysis highlights differences, improvements, and cases your model captures that the baseline misses.
Required Inputs
From configs/env.sh:
WORK_DIR: Output directory for resultsMODEL_PATH: Path to your modelBASELINE_MODEL_PATH: Path to the baseline modelREPOSITORY_PATH: Path to the data repositoryTEST_SAMPLES: Path to the test samplesBT_JSON: Bootstrap JSON for cohort filteringBT_COHORT: Bootstrap cohort definitionBASELINE_COMPARE_TOP: Top percentage of predictions to compare (default: 5%)
How to Run
From your TestKit folder, execute:
What This Test Does
- Compares top X% of flagged patients between your model and the baseline
- Calculates performance metrics, overlap scores, and correlation
- Trains a propensity model to differentiate patients flagged by each model
- Analyzes which features are used differently by your model vs. baseline
Output Location
$WORK_DIR/ButWhy.baseline/: But Why analysis for baseline model (Global.html, Global.ungrouped.html, single_features)$WORK_DIR/bootstrap/bt_baseline_compare.tsv: Bootstrap results comparison$WORK_DIR/compare_to_baseline/correlation.txt: Correlation of scores between models$WORK_DIR/compare_to_baseline/: Propensity model resultsGlobal.html: Most important signals differentiating flagged patientssingle_features/: But Why analysis for each important featurecompare_rep.txt: Table comparing feature means, std, etc. for flagged patients
$WORK_DIR/compare_to_baseline/baseline_coverage_by_mes_$BASELINE_COMPARE_TOP.html: Coverage/intersection of flagged individuals at different cutoffs$WORK_DIR/compare_to_baseline/compare_Age_top_flagged.html: Comparison of flagged ages by both models
How to Interpret Results
- Use overlap and coverage metrics to see how much your model improves on the baseline
- Review propensity model outputs to understand which features drive differences
- Check correlation and age comparisons for additional insights