Sanity test experiment for debugging
Date: 18.11.2019

It looks like all our methods are tuned correctly (no major bugs): the results of all the methods look reasonable.

Coby reviewed 8 samples (out of the 140 available) for the diabetes predictor (its 69 features were grouped by signal into 11 groups) and scored each explanation from 1 to 4 (the higher the better). The distribution of scores for each method and its average score are shown below:
| Explainer_name | 1 | 2 | 3 | 4 | Not scored | Average Score |
|---|---|---|---|---|---|---|
| Tree | 0 | 1 | 4 | 3 | 132 | 3.25 |
| missing_shap | 0 | 2 | 2 | 4 | 132 | 3.25 |
| LIME_GAN | 0 | 2 | 3 | 3 | 132 | 3.125 |
| SHAP_GAN | 0 | 2 | 3 | 3 | 132 | 3.125 |
| Tree_with_cov | 0 | 3 | 1 | 4 | 132 | 3.125 |
| knn_with_th | 2 | 2 | 1 | 3 | 132 | 2.625 |
| knn | 0 | 5 | 2 | 1 | 132 | 2.5 |
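
The Average Score column is simply the mean of the 1-4 scores over the 8 scored samples; the 132 unscored samples are excluded. For reference, here is a minimal Python sketch that recomputes the column from the counts above (the dictionary literal just transcribes the table):

```python
# Recompute the "Average Score" column from the score counts in the table.
# Only the 8 scored samples contribute; the 132 unscored samples are ignored.
score_counts = {
    "Tree":          {1: 0, 2: 1, 3: 4, 4: 3},
    "missing_shap":  {1: 0, 2: 2, 3: 2, 4: 4},
    "LIME_GAN":      {1: 0, 2: 2, 3: 3, 4: 3},
    "SHAP_GAN":      {1: 0, 2: 2, 3: 3, 4: 3},
    "Tree_with_cov": {1: 0, 2: 3, 3: 1, 4: 4},
    "knn_with_th":   {1: 2, 2: 2, 3: 1, 4: 3},
    "knn":           {1: 0, 2: 5, 3: 2, 4: 1},
}

for name, counts in score_counts.items():
    n_scored = sum(counts.values())  # 8 for every method
    average = sum(score * count for score, count in counts.items()) / n_scored
    print(f"{name}: {average:.3f}")
```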
* Tree = the regular tree Shapley implementation
* missing_shap = the Shapley algorithm with 500 random masks; but instead of using a GAN or Gibbs sampling to generate samples, we use an additional predictor (an xgboost regression model) to predict the diabetes model's scores when it sees random masks of missing values. The grouping is calculated internally (see the sketch after this list).
* LIME_GAN = the LIME algorithm with a GAN: sampling random masks and fitting a model. The grouping is calculated internally.
* SHAP_GAN = the Shapley algorithm with a GAN: sampling random masks rather than an exact calculation. The grouping is calculated internally.
* Tree_with_cov = the tree implementation with the covariance fix
* knn_with_th = the KNN algorithm with a 5% threshold on what to explain
* knn = the KNN algorithm without a threshold
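
To make the mask-based variants concrete, below is a minimal, illustrative sketch of Monte Carlo Shapley estimation over feature groups. This is not the exact code from the experiment: `value_fn` is a placeholder for whatever returns the diabetes model's score under a given group mask (the surrogate xgboost regressor in missing_shap, or GAN-based sample generation in SHAP_GAN), and `n_permutations=500` only echoes the "500 random masks"; the experiment's actual sampling scheme may differ.

```python
import numpy as np

def shapley_by_permutation(value_fn, n_groups, n_permutations=500, seed=0):
    """Monte Carlo Shapley estimate over feature groups.

    value_fn(mask) -> float should return the model's score when only the
    groups with mask[g] == 1 are treated as present (e.g. via a surrogate
    model or generated samples for the masked-out groups).
    """
    rng = np.random.default_rng(seed)
    phi = np.zeros(n_groups)
    for _ in range(n_permutations):
        order = rng.permutation(n_groups)       # random order of group arrival
        mask = np.zeros(n_groups, dtype=float)
        prev = value_fn(mask)                   # score with all groups masked
        for g in order:
            mask[g] = 1.0                       # reveal group g
            cur = value_fn(mask)
            phi[g] += cur - prev                # marginal contribution of g
            prev = cur
    return phi / n_permutations

# Toy usage with 11 groups (matching the grouping in this experiment) and an
# additive value function; the estimates recover the weights exactly.
weights = np.arange(1.0, 12.0)
phi = shapley_by_permutation(lambda m: float(weights @ m), n_groups=11)
```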