Simulator
Goal
Estimate the expected performance of a frozen model in a new environment (covariate shift). The simulator lets you specify target population characteristics (age, sex, availability of signals, etc.) and estimates how model performance will change.
Approach (high level)
The simulator reweights or subsamples an existing labeled dataset (where the ground truth and the original performance are known) to match a user-defined target population. This is a statistical adjustment, not a machine-learning one: it is conceptually similar to inverse-probability weighting, but it uses an explicit target population definition rather than learned propensities.
Example: if ages in the original population are uniformly distributed over 40-80 and the target environment covers 50-80, patients aged 40-49 receive zero weight and the performance metrics are computed on the reweighted population.
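The age example above can be sketched in a few lines of Python. This is a minimal illustration, not the simulator's actual code: the function name weighted_sensitivity_specificity and the toy data are hypothetical, and it assumes the target population keeps the same age distribution within 50-80, so that 0/1 weights are sufficient.

```python
import numpy as np

def weighted_sensitivity_specificity(y_true, y_pred, weights):
    """Sensitivity and specificity where each sample counts by its weight."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    w = np.asarray(weights, dtype=float)
    pos_w = w[y_true].sum()    # total weight of true cases
    neg_w = w[~y_true].sum()   # total weight of true controls
    sens = w[y_true & y_pred].sum() / pos_w if pos_w > 0 else float("nan")
    spec = w[~y_true & ~y_pred].sum() / neg_w if neg_w > 0 else float("nan")
    return sens, spec

# Toy labeled reference data (made up for illustration).
age    = np.array([42, 47, 55, 63, 71, 78])
y_true = np.array([1, 0, 1, 0, 1, 0])   # ground-truth outcome
y_pred = np.array([0, 0, 1, 1, 1, 0])   # frozen model's flag at a fixed cutoff

# Target environment 50-80: patients aged 40-49 get zero weight.
weights = ((age >= 50) & (age <= 80)).astype(float)

sens_all, spec_all = weighted_sensitivity_specificity(y_true, y_pred, np.ones(len(age)))
sens_tgt, spec_tgt = weighted_sensitivity_specificity(y_true, y_pred, weights)
print(f"original population: sens={sens_all:.2f} spec={spec_all:.2f}")
print(f"target population:   sens={sens_tgt:.2f} spec={spec_tgt:.2f}")
```

The model itself is never retrained or rescored; only the weight each reference sample carries in the metric changes.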
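This method generalizes to multi-dimensional scenarios (age, sex, signal missingness, etc.). The estimate of expected performance is reliable only when the target population is well specified and each target subgroup is represented in the original dataset; a subgroup with no samples cannot be reweighted.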
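For several dimensions at once, one common way to derive weights from an explicit target definition is post-stratification: each sample is weighted by the target share of its stratum divided by that stratum's share in the source data. The sketch below assumes this technique for illustration; it is not confirmed to be what the simulator does internally, and the function names, strata, and target shares are hypothetical. It also reports Kish's effective sample size, a quick check on how much information survives the reweighting.

```python
from collections import Counter

def poststratification_weights(strata, target_shares):
    """Weight per sample = target share of its stratum / source share.

    Strata that are absent from target_shares get weight 0.
    """
    n = len(strata)
    counts = Counter(strata)
    missing = [s for s, share in target_shares.items()
               if share > 0 and s not in counts]
    if missing:
        # A target subgroup with no source samples cannot be reweighted.
        raise ValueError(f"target strata absent from source data: {missing}")
    return [target_shares.get(s, 0.0) / (counts[s] / n) for s in strata]

def effective_sample_size(weights):
    """Kish's effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

# Hypothetical source cohort: one (age band, sex) stratum per patient.
strata = [
    ("40-49", "F"), ("40-49", "M"),
    ("50-64", "F"), ("50-64", "F"), ("50-64", "M"),
    ("65-80", "F"), ("65-80", "M"), ("65-80", "M"),
]

# Hypothetical target population: shares sum to 1, no one under 50.
target_shares = {
    ("50-64", "F"): 0.2, ("50-64", "M"): 0.3,
    ("65-80", "F"): 0.3, ("65-80", "M"): 0.2,
}

weights = poststratification_weights(strata, target_shares)
ess = effective_sample_size(weights)
print(weights)
print(f"effective sample size: {ess:.2f} of {len(strata)}")
```

A low effective sample size relative to the cohort size signals that a few heavily weighted samples dominate the estimate, so the resulting metrics should be read with wide uncertainty.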
Code location
The simulator is implemented in the MR_Tools repository under: AlgoMarker_python_API/PopulationAnalyzer
Slides and documentation:
Running the server
From the simulator directory start the UI server:
Default port: 3764. Use the full path to ui.py if you run it from another working directory.
Adding a new AlgoMarker
To register a new AlgoMarker in the simulator UI:
- Copy an existing AlgoMarker Python file (for example LungFlag.py) into the algomarkers/ folder.
- The chosen filename (without .py) will appear in the UI. Filenames may use _SLASH_ to show a / in the UI.
- Create or edit the AlgoMarker config Python file (the module the UI imports) and define the following fields:
  - am_regions: dict mapping region keys to ReferenceInfo objects (paths to reference matrices, repository paths, CV results, etc.).
  - sample_per_pid: numeric bootstrap parameter (how many samples per patient).
  - default_region: optional default region key.
  - additional_info: short descriptive text shown near the model selector.
  - optional_signals: optional list of InputSignal/InputSignalsExistence objects describing extra input groups (e.g., Smoking, Labs, BMI).
  - model_path: path to the model file used by the simulator.
  - orderdinal: optional integer to order this AlgoMarker in the UI.
These fields are used by the server to load reference matrices, build cohorts, and run the simulation UI.
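The filename rule in the steps above can be read as a simple string mapping. The helper below is a hypothetical illustration of that rule, not the server's actual code; ui_display_name and the second filename are made up for the example.

```python
from pathlib import Path

def ui_display_name(config_path):
    """Hypothetical helper: derive the UI label from a config filename.

    Drops the directory and the .py extension, then shows _SLASH_ as "/".
    """
    return Path(config_path).stem.replace("_SLASH_", "/")

print(ui_display_name("algomarkers/LungFlag.py"))
print(ui_display_name("algomarkers/ModelA_SLASH_VariantB.py"))
```

Under this reading, a file named ModelA_SLASH_VariantB.py would be listed as ModelA/VariantB, which works around the fact that a filename cannot contain a slash.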
Example configuration
An example (abridged) config is included below. The full example in the original file shows cohort filters, region definitions, and optional signals.