Checking Causal Inference on Synthetic Data

The program for testing causal-inference methods on synthetic data is located at H:\MR\Projects\Shared\CausalEffects\CausalEffectsUtils\check_toy_model. The program parameters are:

check_toy_model --help
Program options:
  --help                                produce help message
  --trainMatrix arg                     train data file (bin)
  --testMatrix arg                      test data file (bin)
  --validationMatrix arg                validation data file (bin)
  --params arg                          serialized true model file
  --validationITE arg                   File of validation ITE
  --out arg                             output file for ITE graph
  --read_models                         read required models from file
  --write_models                        write generated models to file
  --models_prefix arg                   prefix of models for read/write
  --gen_model_params arg (=lightgbm;num_threads=15;num_trees=200;learning_rate=0.05;lambda_l2=0;metric_freq=250;bagging_fraction=0.5;bagging_freq=1;feature_fraction=0.8;max_bin=50;min_data_in_leaf=250;num_leaves=120)
                                        parameters and definition of
                                        classifiers which are not specifically
                                        given
  --gen_reg_params arg (=xgb;alpha=0.1;colsample_bytree=0.5;eta=0.01;gamma=0.5;booster=gbtree;objective=reg:linear;lambda=0.5;max_depth=3;min_child_weight=100;num_round=250;subsample=0.5)
                                        parameters and definition of regressors
                                        which are not specifically given
  --nbootstrap arg (=100)               # of bootstrap rounds
  --nfolds arg (=8)                     # of folds for cross validation
  --gen_nn_params arg (=batch_size=1000;function=relu;max_num_batches=30000;checkpoint_num_batches=250;data=features;keep_prob=0.8;learning_rate=1e-3;nhidden=200;nlayers=2)
                                        parameters for nn regressionscript
  --do_true                             get ITE from true model
  --do_direct                           directly model true ite
  --do_model                            get ITE from single model
  --model_params arg                    parameters and definition of outcome
                                        predictor
  --add_propensity arg (=0)             If True will add propensity score to
                                        direct model for outcome
  --bonus arg (=0)                      bonums for splitting by treatment in
                                        outcome prediction using xgboost
  --do_nn_model                         get ITE from single NN model
  --do_two_models                       get ITE from single model
  --model0_params arg                   parameters and definition of outcome
                                        predictor for untreated
  --mode1_params arg                    parameters and definition of outcome
                                        predictor for treated
  --do_weighted                         get ITE from propensity weighted model
  --prop_params arg                     parameters and defition of propensity
                                        score
  --weighed_model_params arg            parameters and definition of outcome
                                        predictor
  --do_g_comp                           get ITE using g-computation
  --g_comp_params arg                   parameters and definition of outcome
                                        predictor
  --g_comp_prop_params arg              parameters and definition of propensity
                                        predictor
  --g_comp_reg_params arg               parameters and definition of
                                        counter-factuals regressor
  --g_comp_reg_script arg (=/nas1/UsersData/yaron/MR/Tools/quasi_oracle/PythonScripts/ite_predictor.py)
                                        Python script for t-prediction (ITE)
  --g_comp_reg_script_output arg        output for Python script for
                                        counter-factuals regressor
  --g_comp_reg_script_input arg (=g_comp_matrix)
                                        input for Python script for
                                        counter-factuals regressor
  --g_comp_reg_script_params arg        parameters for python script for
                                        counter-factuals regressor
  --g_comp_cf_params arg                parameters and definition of
                                        counter-factuals classifier
  --gNumCopy arg (=10)                  number of copies per sample in
                                        counterfactual matrix
  --gAddTestMatrix                      add test matrix to counter-factual
                                        regression matrix
  --do_two_models_g_comp                get ITE using g-computation
  --g_comp_params0 arg                  parameters and definition of outcome
                                        predictor for treatment=0
  --g_comp_params1 arg                  parameters and definition of outcome
                                        predictor for treatment=1
  --do_nn_quasi_oracle                  get ITE using quasi-oracle
  --do_quasi_oracle                     get ITE using quasi-oracle
  --e_params arg                        Quasi-Oracle e-prediction params
                                        (propensity)
  --m_params arg                        Quasi-Oracle m-prediction params
                                        (outcome without explicit treatment)
  --t_params arg                        Quasi-Oracle t-prediction params (ITE)
  --t_script arg (=/nas1/UsersData/yaron/MR/Tools/quasi_oracle/PythonScripts/ite_predictor.py)
                                        Python script for t-prediction (ITE)
  --t_script_output arg                 output for Python script for
                                        t-predictions (ITE)
  --t_script_input arg (=ite_matrix)    input for Python script for
                                        t-predictions (ITE)
  --t_script_params arg                 parameters for python script for
                                        t-predictions (ITE)
  --do_oracle                           get ITE using an oracle
  --treatment_params arg                serialized treatment model file
  --extend_matrix                       add quadratic features to t-modeling
                                        matrix
  --preds_files_suffix arg              Siffix for predictions files
  --optimization_file arg               File for optimization of t-predictor
                                        parameters
  --summary_file arg                    File for summary of results
  --do_external                         read predictions from csv file
  --preds_file arg                      predictions file (untreated/treated
                                        pairs)
  --do_external_predictor               generate predictions from a
                                        MedPredictor
  --predictor_file arg                  predictor files (ITE from features
                                        without Treatment)
  --do_external_script                  generate predictions using a script
  --script arg                          external script to run
  --script_input arg                    script data input (--data ...)
  --script_params arg                   script parameters
  --script_output arg                   script prediction output (--preds ...)
  --do_cfr                              run counterfactual regression
  --script_dir arg (=.)                 script data directory
  --cfr_train_script arg (=/nas1/UsersData/yaron/MR/Projects/Shared/CausalEffects/CausalEffectScripts/cfrnet-master/cfr_net_train.py)
  --cfr_trans_script arg (=/nas1/UsersData/yaron/MR/Projects/Shared/CausalEffects/CausalEffectScripts/csv2cfr.py)
  --do_shap                             use Shapley values for ITE
  --do_ipw_shap                         use Shapley values on IPW-corrected
                                        model for ITE
  --do_weighted_nn_model                get ITE from propensity weighted NN
                                        model
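
A typical invocation combines the three matrix files, a source for the true ITE, an output file and one or more do_* switches. The Python sketch below only assembles and prints such a command line; it does not run the program. All file names are hypothetical placeholders, and whether this particular combination of options is sufficient for a real run is an assumption.

```python
import shlex


def build_command(train, test, validation, validation_ite, out, methods):
    """Assemble a check_toy_model command line from flags listed in --help."""
    cmd = [
        "check_toy_model",
        "--trainMatrix", train,
        "--testMatrix", test,
        "--validationMatrix", validation,
        "--validationITE", validation_ite,
        "--out", out,
    ]
    # Each method is enabled by its own switch, e.g. --do_model.
    cmd.extend("--" + name for name in methods)
    return cmd


# Hypothetical file names, for illustration only.
command = build_command(
    "train.bin", "test.bin", "validation.bin",
    "validation_ite.txt", "ite_out.tsv",
    ["do_model", "do_two_models"],
)
print(shlex.join(command))
```

Building the argument list separately from executing it makes it easy to inspect or log the exact command before passing it to a process launcher.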

The program uses trainMatrix and testMatrix to learn model(s) for evaluating individual treatment effects (ITE) and then applies the model(s) to validationMatrix. The program's outputs include descriptive information, as well as tabular output in the out file. Each line in the out file contains three tab-delimited columns:

Method-Name	True-ITE	Estimated-ITE
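
Given the three tab-delimited columns above, the estimates can be compared per method with a few lines of Python. The sketch below computes the root-mean-square error between true and estimated ITE for each method name. The sample rows are made up for illustration, and the assumption that the first column holds names such as do_model is not confirmed by the program output.

```python
import csv
import io
import math
from collections import defaultdict


def rmse_per_method(lines):
    """Return {method: RMSE between the true and estimated ITE columns}."""
    sq_err = defaultdict(float)
    count = defaultdict(int)
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        try:
            true_ite, est_ite = float(row[1]), float(row[2])
        except ValueError:
            continue  # skip a header line, if one is present
        sq_err[row[0]] += (true_ite - est_ite) ** 2
        count[row[0]] += 1
    return {m: math.sqrt(sq_err[m] / count[m]) for m in sq_err}


# Made-up sample rows, for illustration only.
sample = io.StringIO(
    "do_model\t1.0\t0.5\n"
    "do_model\t2.0\t2.5\n"
    "do_two_models\t1.0\t1.0\n"
    "do_two_models\t2.0\t2.0\n"
)
result = rmse_per_method(sample)
print(result)
```

To process a real output file, pass an open file handle instead of the in-memory sample.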

The true ITE is either generated from the generative model (if given, in .bin and .treatment.bin) or read from file (validationITE). The methods currently implemented are:

Method	Description	Comment
do_true	true ITE from generative models
do_direct	Learn a regression model that directly evaluates ITE on trainMatrix and apply it to validationMatrix	This is a debugging method, as it assumes the true ITE is known for trainMatrix
do_model	Learn a naive model ƒ(x,T)→y, and evaluate ITE = ƒ(x,1) - ƒ(x,0)
do_nn_model	Same as do_model but using external learning/prediction scripts for the model	Allows interfacing with TensorFlow
do_two_models	Learn two models ƒ_T=1(x)→y and ƒ_T=0(x)→y and evaluate ITE = ƒ_T=1(x) - ƒ_T=0(x)
do_weighted	Use Inverse Propensity Weighting (IPW) to learn ƒ(x,T)→y, and evaluate ITE = ƒ(x,1) - ƒ(x,0)
do_weighted_nn_model	Same as do_weighted but using external learning/prediction scripts for the model	Allows interfacing with TensorFlow
do_g_comp	Use "G-Computation" - create counter-factuals using a model, and then use them for learning a second model.	IPW optional for first model
do_two_models_g_comp	A combination of do_g_comp & do_two_models
do_quasi_oracle	Evaluate ITE using Quasi-Oracle - e* and m* evaluated internally and ITE using an external script	ITE is evaluated using an external script to allow using TensorFlow NN
do_nn_quasi_oracle	Evaluate ITE using Quasi-Oracle - e* and m* also evaluated using external scripts	Allows interfacing with TensorFlow on all stages
do_oracle	Similar to Quasi-Oracle, only using true e and m instead of estimated e* and m*	This is a debugging method, as it assumes the true e and m are known
do_external	Import ITE from file and generate out file
do_external_predictor	Read a MedPredictor object and apply on validationMatrix to generate ITE
do_external_script	Apply an external script to generate ITE on validationMatrix
do_cfr	Apply Uri Shalit's CounterFactual Regression Methods	Use downloaded scripts
do_shap	Use Shapley values of the naive model ƒ(x,T)→y as estimators for ITE
do_ipw_shap	Use Shapley values of the IPW-learned model ƒ(x,T)→y as estimators for ITE
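
Two of the methods in the table can be illustrated on a toy problem in plain Python. The sketch below is not the program's implementation: it uses a hypothetical one-feature generative model, simple least-squares lines in place of the LightGBM/XGBoost learners, and noise-free outcomes so that both estimates should match the true ITE up to rounding. The oracle part assumes the quasi-oracle approach is the residual-on-residual formulation, in which the outcome residual y - m(x) is regressed on the treatment residual T - e(x) times a model of the ITE.

```python
import math
import random


def propensity(x):
    """True e(x) = P(T=1 | x) of the hypothetical toy model."""
    return 1.0 / (1.0 + math.exp(-2.0 * x))


def true_ite(x):
    """True treatment effect of the hypothetical toy model."""
    return 1.0 + x


def mean_outcome(x):
    """True m(x) = E[y | x], averaged over treatment."""
    return 2.0 * x + propensity(x) * true_ite(x)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_models_ite(train, xs_eval):
    """As in do_two_models: one model per arm, ITE = f1(x) - f0(x)."""
    fits = {}
    for arm in (0, 1):
        xs = [x for x, t, _ in train if t == arm]
        ys = [y for _, t, y in train if t == arm]
        fits[arm] = fit_line(xs, ys)
    (a0, b0), (a1, b1) = fits[0], fits[1]
    return [(a1 + b1 * x) - (a0 + b0 * x) for x in xs_eval]


def oracle_ite(train, e, m, xs_eval):
    """As in do_oracle, with known e and m and a linear ITE model a + b*x.

    Minimizes sum(((y - m(x)) - (t - e(x)) * (a + b*x)) ** 2)
    by solving the 2x2 normal equations.
    """
    s11 = s12 = s22 = r1 = r2 = 0.0
    for x, t, y in train:
        u = t - e(x)       # treatment residual
        v = y - m(x)       # outcome residual
        z1, z2 = u, u * x  # regressors multiplying a and b
        s11 += z1 * z1
        s12 += z1 * z2
        s22 += z2 * z2
        r1 += z1 * v
        r2 += z2 * v
    det = s11 * s22 - s12 * s12
    a = (r1 * s22 - r2 * s12) / det
    b = (s11 * r2 - s12 * r1) / det
    return [a + b * x for x in xs_eval]


# Training set: treatment assignment depends on x (confounding).
rng = random.Random(0)
train = []
for _ in range(500):
    x = rng.uniform(-1.0, 1.0)
    t = 1 if rng.random() < propensity(x) else 0
    train.append((x, t, 2.0 * x + t * true_ite(x)))  # noise-free outcome

xs_eval = [-0.5, 0.0, 0.5]
est_two = two_models_ite(train, xs_eval)
est_oracle = oracle_ite(train, propensity, mean_outcome, xs_eval)
for x, a, b in zip(xs_eval, est_two, est_oracle):
    print(f"x={x:+.1f} true={true_ite(x):.3f} two_models={a:.3f} oracle={b:.3f}")
```

With noisy outcomes or misspecified models the two estimates would diverge from the true ITE, which is the kind of comparison the tabular output supports.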