The new DllAPITester

The DllAPITester is a testing tool for AlgoMarkers that contains several useful options:

  • Testing results on a sample set via the infrastructure and via the AlgoMarker.
  • Testing via the AlgoMarker library in the infrastructure or via the .so compiled for it.
  • Generating json examples from data.
  • Testing on json examples.
  • Generating dictionaries for AlgoMarkers.
  • Testing using the newer json requests and json responses.

It is a must-use tool whenever packing a new AlgoMarker. In the following explanations we assume one has an AlgoMarker with a model inside, a repository to test on, and a samples file to work with.

General App Parameters

The app is located at ../MR/Libs/Internal/AlgoMarker/Linux/Release/DllAPITester. To build it, simply compile the AlgoMarker directory (smake_rel). Major parameters:

parameter | comment
--- | ---
rep | repository to use
samples | samples file
model | the model file (typically the one in the AlgoMarker wrap)
amconfig | the AlgoMarker config to work with
amlib | the actual .so to use when calling AM_API calls. Optional; without it, the AM_API as currently in the infrastructure is used.
json_dict | a list of dictionaries to load into the AlgoMarker when loading it (prior to actually loading data and getting results)
am_res_file | (optional) the predictions file of the AlgoMarker, as generated by calling the AlgoMarker
direct_csv | (optional) the full feature matrix generated by running through the infrastructure
am_csv | (optional) the full feature matrix generated by running through the AlgoMarker
single | run tests one by one rather than in batch. Slower. Some modes, such as generating json outputs, work only in single mode.
out_jsons | if given, will generate a file with a list of jsons (in one long array) that contain all the data needed to produce a prediction. These can be used for direct tests in the AlgoAnalyzer.
in_jsons | if given, will take input data from the given jsons.
jreq | input json request (could also contain data)
jresp | get the output as a json response
create_jreq | generate a request for a given samples file (also requires --jreq_defs, optionally the --add_data_to_jreq flag, and an output file in --jreq_out)
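
For orientation, a minimal run combining the core parameters above could look like this (all paths here are hypothetical placeholders, not real files):

    DllAPITester --rep /path/to/my.repository --samples ./my.samples --model ./my_model.model --amconfig ./my_model.amconfig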

Score Compare: Testing results of an AlgoMarker vs. results directly from the infrastructure, and much more

This test currently checks the prediction only (for single-prediction models); it does not test other accompanying information that may appear on top of it in json responses. It is the major test to perform. To run it:

    DllAPITester --rep ... --samples ... --model ... --amconfig ...

This will do a batch run and test the whole samples file, producing a summary report. In many cases you will need to add:

  • --json_dict : if you require additional dictionaries to be loaded.
  • --single : if you wish to run the test in single mode, one sample at a time, mimicking how it is used by the AlgoAnalyzer.
  • --amlib : point to an .so file (typically the one in the AlgoMarker library). If given, the run will use the API from the library, as would happen in real-life usage; without it, the API as currently in the infrastructure is used.
  • optional output files:
      • --out_jsons : generate json examples out of this run.
      • --am_csv , --direct_csv : when you need the actual feature matrix via the AlgoMarker or via the infrastructure (or both).
      • --am_res_file : the predictions as generated by the AlgoMarker.

An example of a full run with some options enabled:
    DllAPITester --rep /home/Repositories/KPNW/kpnw_apr20/kpnw.repository --samples ./nwp_100.samples --model /nas1/Products/COVID19/QA_Versions/dev_20200608/COVID19-Comp-Flag-2020-05-04.model --amconfig /nas1/Products/COVID19/QA_Versions/dev_20200608/COVID19-Comp-Flag-2020-05-04.amconfig --json_dict /nas1/Work/AlgoMarkers/COVID19/Avi/DIAGNOSIS.txt,/nas1/Work/AlgoMarkers/COVID19/Avi/Drug.txt,/nas1/Work/AlgoMarkers/COVID19/Avi/ADMISSION.txt --single --am_res am.preds --out_jsons n100.jsons
    
    This example will test the COVID predictor on some samples on the requested repository, load some dictionaries prior to that, run in single mode, and output both the predictions to a file and a file containing jsons for all the samples.

Generating json outputs

To do that, simply run a score-compare run with the out_jsons file requested. You need to be in single mode for this option to work. You can optionally set some parameters for the creation (a full example follows the list below):

  • --accountId : accountId for the output jsons.
  • --calculator : calculator name for the output jsons.
  • --units : list of units to add for the requested signals (example: BMI,kg/m^2,Weight,kg,Height,cm,Pack_Years,pack*years,Smoking_Intensity,cigs/day,Smoking_Quit_Date,date,Smoking_Duration,years).
  • --scoreOnDate : flag; use it to have that field in the output jsons.

The jsons will be created with a request_id field "req_id" of the form: req__
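
For instance, a json-generating run could look like the following sketch (paths and names are illustrative placeholders):

    DllAPITester --rep /path/to/my.repository --samples ./my.samples --model ./my_model.model --amconfig ./my_model.amconfig --single --out_jsons ./n100.jsons --accountId my_account --calculator my_calculator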

Running on input json examples

Sometimes it is useful to run on a given file of jsons, in order to recreate a bug or a prediction on specific cases. This can be done by simply using the --in_jsons option (if given, you don't need a repository and samples). Currently this option runs over the jsons in the given file one by one (as in single mode) and prints the score for each; it does not compare it to anything. You can generate a preds file for all the given jsons using the --preds_json option. You can also generate the matrix using the --am_csv option; however, this currently writes the matrix only for the last json in the file (could be improved in the future).
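
Such a run might look like this (file names are placeholders; we assume the model and amconfig are still required even though the repository and samples are not):

    DllAPITester --model ./my_model.model --amconfig ./my_model.amconfig --in_jsons ./n100.jsons --preds_json ./json_runs.preds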

Generating Dictionaries for AlgoMarkers

AlgoMarkers with categorical signals may need accompanying dictionaries to adjust them to the actual dictionaries in the AlgoMarker. Those dictionaries can be created in any way and manner, as long as they end up in the correct format and with the correct meaning. The DllAPITester allows generating such dictionaries for a given AlgoMarker and repository. To do that, one must supply a config file composed of tab-delimited lines with 3 fields: IN means the value is an input categorical value; OUT means an output value, a valid value to send to the AlgoMarker. There are two possible formats for the output; for the one used by the AlgoAnalyzer, use the --simple_dict option. Example run:

DllAPITester --rep /home/Repositories/KPNW/kpnw_apr20/kpnw.repository --dicts_config ../../../NWP/covid_testing/dev_20200504/full_dicts/dicts.config --out_json_dict ./dict.json --simple_dict

Working with input json requests and output json responses

The newer APIs in the library allow a new way of generating outputs. It is now possible to load data from a rep+samples or from in_jsons (as shown above), then use a json request for the inputs (--jreq option) and get the output as a json response (--jresp). More than that: requests can contain the actual data for the patients, and if that is used one can skip loading the data from a repository and samples or from in_jsons. To use these options, simply do a usual run but add a valid --jreq and an output --jresp. The DllAPITester also contains options to generate such json requests with or without data. (TBD: detailed description of the json request input options and the structure of the output json response.)
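
Putting the documented flags together, a request/response run and a request-generating run might look like the following sketches (all file names are illustrative placeholders; any requirements of --create_jreq beyond the flags listed in the parameter table are not detailed here):

    DllAPITester --rep /path/to/my.repository --samples ./my.samples --model ./my_model.model --amconfig ./my_model.amconfig --jreq ./request.json --jresp ./response.json
    DllAPITester --rep /path/to/my.repository --samples ./my.samples --model ./my_model.model --amconfig ./my_model.amconfig --create_jreq --jreq_defs ./jreq_defs --add_data_to_jreq --jreq_out ./request_with_data.json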