Sanity test experiment for debugging
Date: 18.11.2019

It looks like all our methods are tuned correctly (no major bugs): the results of all the methods look reasonable.

Coby reviewed 8 samples (out of the 140 available) for the diabetes predictor (its 69 features were grouped by signal into 11 groups) and scored each explanation from 1 to 4 (the higher the better). The distribution of scores for each method and its average score are shown below:
| Explainer_name | 1 | 2 | 3 | 4 | Not scored | Average Score |
|---|---|---|---|---|---|---|
| Tree | 0 | 1 | 4 | 3 | 132 | 3.25 |
| missing_shap | 0 | 2 | 2 | 4 | 132 | 3.25 |
| LIME_GAN | 0 | 2 | 3 | 3 | 132 | 3.125 |
| SHAP_GAN | 0 | 2 | 3 | 3 | 132 | 3.125 |
| Tree_with_cov | 0 | 3 | 1 | 4 | 132 | 3.125 |
| knn_with_th | 2 | 2 | 1 | 3 | 132 | 2.625 |
| knn | 0 | 5 | 2 | 1 | 132 | 2.5 |
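
The Average Score column is simply the mean of the 1-4 scores over the 8 scored samples; the 132 unscored samples are excluded. For reference, here is a minimal Python sketch that recomputes the column from the counts above (the dictionary literal just transcribes the table):

```python
# Recompute the "Average Score" column from the score counts in the table.
# Only the 8 scored samples contribute; the 132 unscored samples are ignored.
score_counts = {
    "Tree":          {1: 0, 2: 1, 3: 4, 4: 3},
    "missing_shap":  {1: 0, 2: 2, 3: 2, 4: 4},
    "LIME_GAN":      {1: 0, 2: 2, 3: 3, 4: 3},
    "SHAP_GAN":      {1: 0, 2: 2, 3: 3, 4: 3},
    "Tree_with_cov": {1: 0, 2: 3, 3: 1, 4: 4},
    "knn_with_th":   {1: 2, 2: 2, 3: 1, 4: 3},
    "knn":           {1: 0, 2: 5, 3: 2, 4: 1},
}

for name, counts in score_counts.items():
    n_scored = sum(counts.values())  # 8 for every method
    average = sum(score * count for score, count in counts.items()) / n_scored
    print(f"{name}: {average:.3f}")
```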
* Tree = the regular tree Shapley implementation
* missing_shap = the Shapley algorithm with 500 random masks; but instead of using a GAN or Gibbs sampling to generate samples, we use an additional predictor (an xgboost regression model) to predict the diabetes model's scores when it sees random masks of missing values. The grouping is calculated internally (see the sketch after this list).
* LIME_GAN = the LIME algorithm with a GAN: sampling random masks and fitting a model. The grouping is calculated internally.
* SHAP_GAN = the Shapley algorithm with a GAN: sampling random masks rather than an exact calculation. The grouping is calculated internally.
* Tree_with_cov = the tree implementation with the covariance fix
* knn_with_th = the KNN algorithm with a 5% threshold on what to explain
* knn = the KNN algorithm without a threshold
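
To make the mask-based variants concrete, below is a minimal, illustrative sketch of Monte Carlo Shapley estimation over feature groups. This is not the exact code from the experiment: `value_fn` is a placeholder for whatever returns the diabetes model's score under a given group mask (the surrogate xgboost regressor in missing_shap, or GAN-based sample generation in SHAP_GAN), and `n_permutations=500` only echoes the "500 random masks"; the experiment's actual sampling scheme may differ.

```python
import numpy as np

def shapley_by_permutation(value_fn, n_groups, n_permutations=500, seed=0):
    """Monte Carlo Shapley estimate over feature groups.

    value_fn(mask) -> float should return the model's score when only the
    groups with mask[g] == 1 are treated as present (e.g. via a surrogate
    model or generated samples for the masked-out groups).
    """
    rng = np.random.default_rng(seed)
    phi = np.zeros(n_groups)
    for _ in range(n_permutations):
        order = rng.permutation(n_groups)       # random order of group arrival
        mask = np.zeros(n_groups, dtype=float)
        prev = value_fn(mask)                   # score with all groups masked
        for g in order:
            mask[g] = 1.0                       # reveal group g
            cur = value_fn(mask)
            phi[g] += cur - prev                # marginal contribution of g
            prev = cur
    return phi / n_permutations

# Toy usage with 11 groups (matching the grouping in this experiment) and an
# additive value function; the estimates recover the weights exactly.
weights = np.arange(1.0, 12.0)
phi = shapley_by_permutation(lambda m: float(weights @ m), n_groups=11)
```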