create_registry
A tool that creates a MedRegistry and, if sampling strategy parameters are provided, also creates MedSamples. The program steps:
- Creates a MedRegistry or loads one from a text file. You can provide either a file path or a config file from which the MedRegistry is generated.
- Creates or loads from a text file a MedRegistry for censoring (time periods that mark the patient's "membership" period, also in MedRegistry format). OPTIONAL: if not given, the patient is assumed to have full membership.
- Creates MedLabel from registry and censor_registry using the problem definition arguments: the time window arguments plus the labeling policy arguments. This object knows how to "label" the outcome of a sample at a given time, or to exclude the sample, based on the MedRegistry and the LabelParams.
- Creates MedSamples from the MedSamplingStrategy arguments that define how to sample, plus any additional filtering arguments you provide to restrict sampling (age, years, ...).
Create samples from MedRegistry
- LABELING_PARAMS - defines how to label the sample (case/control or another outcome value). This is the initialization text for MedLabels.
- SAMPLER_TYPE - the type of sampler. The options are listed in the code documentation of make_sampler; more information is available under MedSamplingStrategy.
- SAMPLER_ARGS - the arguments for the specific sampler. See MedSamplingStrategy or browse the arguments of the specific sampler.
- OPTIONAL_FILTERING_PARAMS_LIKE_AGE - parameters that filter the samples, such as age. See FilterParams. For example: "min_age=0;max_age=90;min_time=20120101;max_time=20180101"
SAMPLER_TYPE and SAMPLER_ARGS are independent of the labeling: they only define when to give a score, or when to try to give one.
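The parameter strings on this page (filter parameters, sampler arguments, labeling parameters) all use a "key=value;key=value" layout. As a rough illustration only, the Python sketch below splits such a string into a dictionary. The helper parse_params is hypothetical and written for this page; it is not part of create_registry, and the tool's own parsing may differ.

```python
def parse_params(text: str) -> dict:
    """Split a 'key=value;key=value' string into a dict.

    Hypothetical helper for illustration; not the tool's own parser.
    Values are kept as strings and only the first '=' separates key from value.
    """
    params = {}
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments such as a trailing ';'
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        params[key.strip()] = value.strip()
    return params


filters = parse_params("min_age=0;max_age=90;min_time=20120101;max_time=20180101")
print(filters)
```

Keeping the values as strings avoids guessing at types; a value such as "0:within,within|1:before_start,after_start" passes through untouched.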
Example:
We have an influenza MedRegistry in which a patient is marked as a control while they have membership and no influenza. Each influenza event is recorded as a 1-day time window.
We want to predict which patients will have flu within 1 year. The censor registry is the membership signal; we require no gaps in membership over the whole year.
The labeling params:
"time_from=0;time_to=365;censor_time_from=0;censor_time_to=365;conflict_method=max;censor_interaction_mode=all:within:within;label_interaction_mode=0:within,within|1:before_start,after_start"
The sampling:
--sampler_type yearly --sampler_args "start_year=2016;end_year=2018;prediction_month_day=901;day_jump=365"
This predicts on the 1st of September of each year from 2016 to 2017, provided a score can be given for the patient at that time (that is, the date matches a registry record).
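To make the time windows in this example concrete, the sketch below computes the outcome window implied by time_from=0 and time_to=365 for a prediction date of 1 September 2016. It assumes the offsets are counted in days from the prediction date, which matches the 1-year horizon described in the example but is an assumption rather than a statement about the tool's internals. The helper outcome_window is hypothetical.

```python
from datetime import datetime, timedelta


def outcome_window(pred_date: int, time_from: int, time_to: int) -> tuple:
    """Return (start, end) of the outcome window as YYYYMMDD integers.

    Hypothetical helper: assumes time_from/time_to are day offsets
    relative to the prediction date.
    """
    base = datetime.strptime(str(pred_date), "%Y%m%d")
    start = base + timedelta(days=time_from)
    end = base + timedelta(days=time_to)
    return int(start.strftime("%Y%m%d")), int(end.strftime("%Y%m%d"))


# Prediction on 1 September 2016 with time_from=0;time_to=365
print(outcome_window(20160901, 0, 365))  # (20160901, 20170901)
```

With censor_time_from=0 and censor_time_to=365, the same calculation gives the period over which membership would have to be continuous.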
Create MedRegistry from command
Needs to explain and give some examples on how to create MedRegistry
- REGISTRY_TYPE : one of "binary" (for binary problems), "categories" (for outcomes with more than case/control states), or "keep_alive" (mostly used to generate membership periods based on patient activity)
- REGISTRY_ARGS : the arguments for the registry type. For example, "binary" has these options:
- max_repo_date - the maximal repository date at which to cut registry dates, in YYYYMMDD format
- start_buffer_duration - buffer duration from the first "rule" start. Usually set to 0
- end_buffer_duration - buffer duration from the last "rule" end, in case we want to "trim" the last period. Usually set to 0
- allow_prediciton_in_case - if true, rules continue to be processed even during a "case" time period
- seperate_cases - allows more than one case time period. This might be useful for influenza events, which may occur several times.
- config_signals_rules - file path to registry rules to define the MedRegistry records
- Tab-delimited file with 2 columns: RegistrySignalType and its arguments
Example config_signals_rules file for CKD from 2 to 3 and up: