action_outcome_effect
The tool is located at $MR_ROOT/Tools/action_outcome_effect. It estimates the effect of a treatment/action on an outcome. The main input is either MedSamples plus a json used to create the matrix, or MedFeatures (the matrix itself), together with a list of confounders.
Disclaimers:
- We need to list all covariates that affect the action/treatment. Because the action is a human decision, listing them all should be doable. If we miss a covariate or confounder, the results may be misleading.
- The matching process may be prone to weighting errors that hurt the matching, or the predictor may perform poorly. Check second_weighted_auc: it should be as close as possible to 0.5 (0.5 is a perfect match, meaning no information is left in the covariates).
- We need strong ignorability: in addition to the first point (listing all covariates), every patient in the population must have a probability of action/treatment that is not purely 0 or 1. We define a cutoff probability and drop patients for whom there is no dilemma, so the final population on which results are shown differs from the originally requested one (probably with no very healthy patients). Check the covariate distribution in the new population after the drop.
- The action/treatment may affect the outcome indirectly through other covariates. For example, taking statins lowers your LDL value, and that is what lowers your risk of stroke/MI. It is important to understand that when two patients face a treatment dilemma, the treatment effect may occur indirectly, and we measure that as well. We are not testing for the direct treatment effect only.
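The second_weighted_auc check from the disclaimers can be illustrated with a small Python sketch. This is not the tool's code: the toy data, the function names, and the use of a single covariate as the score (instead of a cross-validated second model) are all assumptions made for illustration. It shows that a covariate which separates treated from untreated patients before reweighting carries no information after inverse probability reweighting, provided the treatment probabilities are correct.

```python
def ipw_weights(actions, propensities):
    """Inverse probability weights: 1/p for treated, 1/(1-p) for untreated."""
    return [1.0 / p if a == 1 else 1.0 / (1.0 - p)
            for a, p in zip(actions, propensities)]


def weighted_auc(labels, scores, weights):
    """Weighted share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [(s, w) for y, s, w in zip(labels, scores, weights) if y == 1]
    neg = [(s, w) for y, s, w in zip(labels, scores, weights) if y == 0]
    total = sum(w for _, w in pos) * sum(w for _, w in neg)
    correct = 0.0
    for sp, wp in pos:
        for sn, wn in neg:
            if sp > sn:
                correct += wp * wn
            elif sp == sn:
                correct += 0.5 * wp * wn
    return correct / total


# Hypothetical toy data: one binary covariate that drives the treatment decision.
covariate = [1, 1, 1, 1, 0, 0, 0, 0]
actions = [1, 1, 1, 0, 1, 0, 0, 0]  # 1 = treated, 0 = untreated
# Assumed calibrated treatment probabilities: 0.75 when covariate == 1, else 0.25.
propensities = [0.75 if x == 1 else 0.25 for x in covariate]

# Before reweighting, the covariate separates treated from untreated patients.
unweighted = weighted_auc(actions, covariate, [1.0] * len(actions))
# After reweighting, the same covariate no longer separates them.
weights = ipw_weights(actions, propensities)
reweighted = weighted_auc(actions, covariate, weights)

print(round(unweighted, 3))  # 0.75
print(round(reweighted, 3))  # 0.5
```

In the tool, the score would come from a second model trained on all covariates, so a value far from 0.5 suggests that the weights did not balance the populations.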
What does the tool do?
- Selects a model whose cross-validated prediction scores squeeze all the information out of the covariates: if we apply inverse probability reweighting (a method similar to matching, used to match the populations) and then try to learn a validation model with cross validation, we should reach a low AUC. The tool selects the model for which the secondary validation model, trained after the matching, achieves the worst AUC (ideally close to 0.5).
- Trains a model that predicts the action/treatment and calibrates its scores to probabilities.
- Matches or reweights using the model score to cancel the confounders. It drops patients who are only treated or only untreated, because we can't measure the treatment effect on them. It also shows a comparison of the populations before and after the matching for each of the covariates.
- Writes the stats (number of cases and controls with the original outcome) for each of the given groups to compare.
- Runs the whole process in a bootstrap manner to get the mean, std, and CI for each measured number in each group.
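The steps above can be sketched roughly in Python. This is a minimal illustration under stated assumptions, not the tool's implementation: the cutoff value, the toy rows, the function names, and the choice of "weighted outcome rate of treated minus untreated" as the measured number are all hypothetical. The tool itself reports case and control counts per group.

```python
import random
import statistics


def keep_overlap(propensities, cutoff):
    """Indices of patients whose treatment probability lies in [cutoff, 1 - cutoff]."""
    return [i for i, p in enumerate(propensities) if cutoff <= p <= 1.0 - cutoff]


def weighted_rate(outcomes, weights):
    """Weighted fraction of cases (outcome == 1)."""
    return sum(w * y for y, w in zip(outcomes, weights)) / sum(weights)


def effect_once(rows):
    """Weighted outcome rate of treated minus untreated; None if one arm is empty.

    Each row is (action, treatment probability, outcome).
    """
    treated = [(y, 1.0 / p) for a, p, y in rows if a == 1]
    untreated = [(y, 1.0 / (1.0 - p)) for a, p, y in rows if a == 0]
    if not treated or not untreated:
        return None
    return weighted_rate(*zip(*treated)) - weighted_rate(*zip(*untreated))


def bootstrap_effect(rows, n_bootstrap=200, seed=0):
    """Resample rows with replacement and summarize the estimate (mean, std, 95% CI)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_bootstrap):
        sample = [rows[rng.randrange(len(rows))] for _ in rows]
        estimate = effect_once(sample)
        if estimate is not None:
            estimates.append(estimate)
    estimates.sort()
    low = estimates[int(0.025 * len(estimates))]
    high = estimates[min(len(estimates) - 1, int(0.975 * len(estimates)))]
    return {"mean": statistics.mean(estimates),
            "std": statistics.stdev(estimates),
            "ci": (low, high)}


# Hypothetical toy data: (action, calibrated treatment probability, outcome).
rows = [
    (1, 0.99, 0), (1, 0.70, 0), (1, 0.60, 1), (1, 0.40, 0), (1, 0.30, 0),
    (0, 0.70, 1), (0, 0.50, 1), (0, 0.40, 0), (0, 0.20, 1), (0, 0.01, 0),
]
kept = keep_overlap([p for _, p, _ in rows], cutoff=0.05)  # assumed cutoff
overlap_rows = [rows[i] for i in kept]
print(len(rows), "->", len(overlap_rows))  # 10 -> 8
summary = bootstrap_effect(overlap_rows)
print(summary)
```

The two patients with probabilities 0.99 and 0.01 are dropped before any estimate is computed, which mirrors why the final population differs from the requested one.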
App Help
get help from the app
Example run:
You can see that after the matching the secondary model does not achieve good results: "SECOND_WEIGHTED_AUC = 0.499". However, the population is very different from the originally requested one: a predictor can separate them with AUC=0.751, and "DM_Registry" has more than doubled, from 0.05 to 0.11. The GOOD/BAD keywords only show where the populations differ.

Remember that those differences are unavoidable: to infer causality you have to look at a population with strong ignorability. For example, if you have very healthy patients with LDL = 70, low BMI and no statins (and you never see treated patients with those covariates in the data), you simply cannot infer a treatment effect for them, so you have to drop them. The same happens with "very sick people". Inferring a treatment effect only works in the grey zones, where there is a dilemma.

The tool also writes the results to /tmp/LDL.txt (if nbootstrap==1, all the numbers are reported without STD and CI):
Input Arguments:
cat base_config_example.cfg
The main input argument is the data file, which can be either MedSamples with a json to create the matrix, or the MedFeatures itself.
- patient_action - a file with the same number of lines as there are samples in the input; each line corresponds to the sample at the same position in the input. Each line holds 0/1 to mark [no treatment, treatment] for that sample.
- confounders_file - a file listing the search names of all the confounders in the input matrix (a column is selected if its name contains the search name). Each line consists of one confounder search name.
- example: $> head confounder.list
Here we have 3 confounders: Age, Gender and LDL.last
- patient_groups_file - a file with the same number of lines as there are samples in the input; each line corresponds to the sample at the same position in the input. Each line holds the risk group name of the sample, used later to split the results. For example, you may write "Age=20-40;Gender=Male" in a line to mark the sample as belonging to that group in the output results.
- models_selection_file - a file that initializes the parameters of the model. You may provide more than one option for a parameter, separated by ",", and then the tool selects the best option among all available options. You may also provide only one option, in which case the tool just uses it. For example: $> head xgb_model_selection.cfg
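As a rough illustration of how the line-aligned input files described above relate to the matrix, here is a Python sketch. It is not the tool's code and does not show the contents of any of the files: the matrix column names, the second group name, the outcome values, and the case-sensitive substring match are assumptions. Only the three search names and the "Age=20-40;Gender=Male" group name come from the text.

```python
from collections import Counter


def select_confounder_columns(column_names, search_names):
    """Keep columns whose name contains at least one confounder search name."""
    return [c for c in column_names if any(s in c for s in search_names)]


def group_counts(actions, groups, outcomes):
    """Count samples per (group, action, outcome); the three lists are line-aligned."""
    if not (len(actions) == len(groups) == len(outcomes)):
        raise ValueError("patient_action, patient_groups_file and samples must have the same length")
    return Counter(zip(groups, actions, outcomes))


# Hypothetical matrix column names; the search names follow the confounders example.
columns = ["FTR_000001.Age", "FTR_000002.Gender", "FTR_000003.LDL.last", "FTR_000004.BMI.last"]
search_names = ["Age", "Gender", "LDL.last"]
selected = select_confounder_columns(columns, search_names)
print(selected)

# One entry per sample, in the same order as the samples in the input (toy values).
actions = [1, 0, 1, 0]
groups = ["Age=20-40;Gender=Male", "Age=20-40;Gender=Male",
          "Age=40-60;Gender=Female", "Age=40-60;Gender=Female"]
outcomes = [0, 1, 1, 1]
counts = group_counts(actions, groups, outcomes)
for (group, action, outcome), n in sorted(counts.items()):
    print(group, "action=%d" % action, "outcome=%d" % outcome, "n=%d" % n)
```

Because every per-sample file is matched to the input purely by line position, a length check like the one in `group_counts` is a cheap way to catch a misaligned file before running the tool.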