
MedCohort

MedCohort is a data structure with helpers for working with a cohort: a list of individuals with (dated) outcomes and follow-up times.

MedCohort contains a vector of basic records (CohortRec), each representing a single period for a specific id together with the corresponding outcome information. A MedCohort can be sampled to generate MedSamples files according to SamplingParams using one of two functions:

  • int create_sampling_file(SamplingParams &s_params, string out_sample_file) : Generate samples within cohort times that fit the SamplingParams criteria and windows. Sample dates are selected randomly for each window of s_params.jump_days in the legal period.
  • int create_sampling_file_sticked(SamplingParams &s_params, string out_sample_file) : Generate samples within cohort times that fit the SamplingParams criteria and windows. Sample dates are those with the required signals for each window of s_params.jump_days in the legal period (if any exist).

A MedCohort can also be used to estimate the age- and gender-dependent incidence rate. Estimation is done according to IncidenceParams using the following function:

  • int create_incidence_file(IncidenceParams &i_params, string out_file) : Generate an incidence file from cohort + incidence-params. Checks all patient-years within the cohort that fit IncidenceParams and counts positive outcomes within the incidence_years_window.

IncidenceParams initialization:

| Parameter Name | Description | Default Value |
|---|---|---|
| incidence_years_window | how many years ahead do we consider an outcome? | 1 |
| rep | Repository configuration file | None |
| from_year | first year to consider in calculating incidence | 2007 |
| to_year | last year to consider in calculating incidence | 2013 |
| gender_mask | mask for gender specification (rightmost bit on for male, second for female) | 0x3 |
| train_mask | mask for TRAIN-value specification (three rightmost bits for TRAIN = 1,2,3) | 0x7 |
| from_age | minimal age to consider | 30 |
| to_age | maximal age to consider | 90 |
| age_bin | binning of ages | 5 |
| min_samples_in_bin | minimal required samples to estimate incidence per bin | 20 |
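
As an illustration of how the parameters above could interact, here is a minimal sketch of age-binned incidence counting. It is not the MedCohort implementation: the `PatientYear` struct, the `mask_allows` and `incidence_by_age_bin` helpers, and the gender coding (1 = male, 2 = female) are all assumptions made for this example.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical patient-year record (NOT the library's CohortRec).
struct PatientYear {
    int age;        // age during this patient-year
    int gender;     // assumed coding: 1 = male, 2 = female
    bool positive;  // outcome fell within incidence_years_window
};

struct BinCount {
    int n = 0;    // patient-years in the bin
    int pos = 0;  // of which positive
};

// True when the bit for a 1-based `value` is set in `mask`.
// Under this convention value 1 maps to the rightmost bit.
inline bool mask_allows(int mask, int value) {
    return value >= 1 && ((mask >> (value - 1)) & 1) != 0;
}

// Returns incidence per age bin, keyed by the bin's lowest age.
// Bins with fewer than min_samples_in_bin patient-years are dropped.
std::map<int, double> incidence_by_age_bin(const std::vector<PatientYear>& recs,
                                           int from_age, int to_age, int age_bin,
                                           int gender_mask, int min_samples_in_bin) {
    assert(age_bin > 0 && from_age <= to_age);
    std::map<int, BinCount> counts;
    for (const PatientYear& r : recs) {
        if (r.age < from_age || r.age > to_age) continue;
        if (!mask_allows(gender_mask, r.gender)) continue;
        int bin_start = from_age + ((r.age - from_age) / age_bin) * age_bin;
        BinCount& c = counts[bin_start];
        c.n++;
        if (r.positive) c.pos++;
    }
    std::map<int, double> out;
    for (const auto& kv : counts) {
        if (kv.second.n >= min_samples_in_bin) {
            out[kv.first] = static_cast<double>(kv.second.pos) / kv.second.n;
        }
    }
    return out;
}
```

With the defaults from the table (from_age 30, to_age 90, age_bin 5, gender_mask 0x3, min_samples_in_bin 20), a bin is reported only once it holds at least 20 eligible patient-years, which keeps estimates from very small bins out of the result.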

SamplingParams initialization:

| Parameter Name | Description | Default Value |
|---|---|---|
| is_continous | continuous mode of sampling vs. stick to signal (0 = stick) | 1 |
| stick_to, stick_to_sigs | comma-separated list of signals required at sampling times | None |
| take_all | in 'stick' mode: take all samples with the required signals within each sampling period | 0 |
| take_closest | in 'stick' mode: take the sample with the required signals that is closest to each target sampling date. If neither take_all nor take_closest is given, a random sample with the required signals within each sampling period is selected | 0 |
| rep | Repository configuration file | None |
| min_age | minimum age for sampling | 0 |
| max_age | maximum age for sampling | 200 |
| gender_mask | mask for gender specification (rightmost bit on for male, second for female) | 0x3 |
| train_mask | mask for TRAIN-value specification (three rightmost bits for TRAIN = 1,2,3) | 0x7 |
| min_year | first year for sampling | 1900 |
| max_year | last year for sampling | 2100 |
| jump_days | days to jump between sampling periods | 180 |
| min_days, min_days_from_outcome | minimal number of days before outcome for sampling | 30 |
| min_case, min_case_years | minimal number of years before outcome for cases | 0 |
| max_case, max_case_years | maximal number of years before outcome for cases | 1 |
| min_control, min_control_years | minimal number of years before outcome for controls | 0 |
| max_control, max_control_years | maximal number of years before outcome for controls | 10 |
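
The difference between continuous sampling and 'stick' sampling with take_closest can be sketched as follows. This is a simplified illustration, not the library code: dates are plain integer day counts, the function names `sample_continuous` and `sample_closest` are hypothetical, and the target sampling date of each window is assumed to be the window start.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <random>
#include <vector>

// Continuous mode: one uniformly random day from each window of
// jump_days inside the legal period [from_day, to_day].
std::vector<int> sample_continuous(int from_day, int to_day, int jump_days,
                                   std::mt19937& rng) {
    assert(jump_days > 0);
    std::vector<int> out;
    for (int start = from_day; start <= to_day; start += jump_days) {
        int end = std::min(start + jump_days - 1, to_day);
        std::uniform_int_distribution<int> pick(start, end);
        out.push_back(pick(rng));
    }
    return out;
}

// Stick mode with take_closest: in each window keep the signal day
// nearest to the target day (assumed here to be the window start).
// Windows that contain no signal day produce no sample.
std::vector<int> sample_closest(const std::vector<int>& signal_days,
                                int from_day, int to_day, int jump_days) {
    assert(jump_days > 0);
    std::vector<int> out;
    for (int start = from_day; start <= to_day; start += jump_days) {
        int end = std::min(start + jump_days - 1, to_day);
        bool found = false;
        int best = 0;
        for (int d : signal_days) {
            if (d < start || d > end) continue;
            if (!found || std::abs(d - start) < std::abs(best - start)) {
                best = d;
                found = true;
            }
        }
        if (found) out.push_back(best);
    }
    return out;
}
```

In this sketch continuous mode always yields one sample per window, while stick mode can yield fewer, because a window without any of the required signals is skipped.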

Include file: H:/MR/Libs/Internal/MedUtils/MedUtils/MedCohort.h