Medial Code Documentation
MedAlgo.h File Reference

MedAlgo - APIs to different algorithms: Linear Models, RF, GBM, KNN, and more.

#include <Logger/Logger/Logger.h>
#include <MedUtils/MedUtils/MedUtils.h>
#include <MedStat/MedStat/MedStat.h>
#include <MedFeat/MedFeat/MedFeat.h>
#include <QRF/QRF/QRF.h>
#include <micNet/micNet/micNet.h>
#include <string.h>
#include <limits.h>
#include <MedProcessTools/MedProcessTools/MedProcessUtils.h>
#include <SerializableObject/SerializableObject/SerializableObject.h>
#include <TQRF/TQRF/TQRF.h>
#include "svm.h"
#include <unordered_map>
#include <random>
#include <map>
#include <string>

Data Structures

class  MedPredictor
 Base interface for predictors.
 

Namespaces

namespace  medial
 medial namespace for functions
 
namespace  medial::models
 models namespace
 
namespace  medial::process
 process namespace
 

Macros

#define NEW_COMPLIER   false
 

Enumerations

enum  MedPredictorTypes {
  MODEL_LINEAR_MODEL = 0 , MODEL_QRF = 1 , MODEL_KNN = 3 , MODEL_BP = 4 ,
  MODEL_MARS = 5 , MODEL_GD_LINEAR = 6 , MODEL_MULTI_CLASS = 7 , MODEL_XGB = 8 ,
  MODEL_LASSO = 9 , MODEL_MIC_NET = 10 , MODEL_BOOSTER = 11 , MODEL_DEEP_BIT = 12 ,
  MODEL_LIGHTGBM = 13 , MODEL_SPECIFIC_GROUPS_MODELS = 14 , MODEL_SVM = 15 , MODEL_LINEAR_SGD = 16 ,
  MODEL_VW = 17 , MODEL_TQRF = 18 , MODEL_BART = 19 , MODEL_EXTERNAL_NN = 20 ,
  MODEL_SIMPLE_ENSEMBLE = 21 , MODEL_BY_MISSING_VALUES_SUBSET = 22 , MODEL_LAST
}
 

Functions

MedPredictorTypes predictor_name_to_type (const string &model_name)
 Mapping from a model name string to the MedPredictorTypes enum.
 
int KMeans (MedMat< float > &x, int K, MedMat< float > &centers, vector< int > &clusters, MedMat< float > &dists)
 K-Means: x is the input matrix (N*M, one sample per row).
 
int KMeans (MedMat< float > &x, int K, int max_iter, MedMat< float > &centers, vector< int > &clusters, MedMat< float > &dists)
 K-Means: x is the input matrix (N*M, one sample per row).
 
int KMeans (float *x, int nrows, int ncols, int K, float *centers, int *clusters, float *dists)
 K-Means: x is the input matrix (N*M, one sample per row).
 
int KMeans (float *x, int nrows, int ncols, int K, int max_iter, float *centers, int *clusters, float *dists, bool verbose_print=true)
 K-Means: x is the input matrix (N*M, one sample per row).
 
int MedPCA (MedMat< float > &x, MedMat< float > &pca_base, vector< float > &varsum)
 Given a matrix, returns the PCA base matrix and the cumulative relative variance explained by its components.
 
int MedPCA_project (MedMat< float > &x, MedMat< float > &pca_base, int dim, MedMat< float > &projected)
 Returns the projection of x onto the first dim dimensions of the PCA base.
 
string medial::models::getParamsInfraModel (void *model)
 Returns the string needed to create the model with init_string. The void * argument is a MedPredictor.
 
void * medial::models::copyInfraModel (void *model, bool delete_old=true)
 Returns a MedPredictor * that is a clone of the given model (parameters only, without learned data). If delete_old is true, the given model is freed.
 
void medial::models::initInfraModel (void *&model)
 Initializes the model (a MedPredictor) by copying its parameters to a new address and freeing the old one.
 
void medial::models::learnInfraModel (void *model, const vector< vector< float > > &xTrain, vector< float > &y, vector< float > &weights)
 Runs Learn on the MedPredictor (wrapper API).
 
vector< float > medial::models::predictInfraModel (void *model, const vector< vector< float > > &xTest)
 Runs predict on the MedPredictor (wrapper API).
 
void medial::models::get_pids_cv (MedPredictor *pred, MedFeatures &matrix, int nFolds, mt19937 &generator, vector< float > &preds)
 Runs cross-validation with folds split by pid (all samples of a pid share one fold) and saves the predictions in preds.
 
void medial::models::get_cv (MedPredictor *pred, MedFeatures &matrix, int nFolds, mt19937 &generator, vector< float > &preds)
 Runs cross-validation in which each sample is assigned to a fold individually, and saves the predictions in preds.
 
void medial::process::compare_populations (const MedFeatures &population1, const MedFeatures &population2, const string &name1, const string &name2, const string &output_file, const string &predictor_type="", const string &predictor_init="", int nfolds=5, int max_learn=0)
 Compares two population matrices.
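
The difference between get_pids_cv and get_cv listed above comes down to how samples are assigned to folds. The following is a minimal sketch of pid-grouped fold assignment, not the library's implementation: sketch_folds_by_pid is a hypothetical helper, and a plain vector of integer pids stands in for MedFeatures.

```cpp
#include <cassert>
#include <random>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not part of MedAlgo.h): draws one fold per pid and
// gives every sample of that pid the same fold index in [0, nFolds).
inline std::vector<int> sketch_folds_by_pid(const std::vector<int> &pids,
                                            int nFolds,
                                            std::mt19937 &generator) {
    assert(nFolds > 0);
    std::uniform_int_distribution<int> pick(0, nFolds - 1);
    std::unordered_map<int, int> pid_fold;  // pid -> fold drawn for it
    std::vector<int> folds;
    folds.reserve(pids.size());
    for (int pid : pids) {
        auto it = pid_fold.find(pid);
        if (it == pid_fold.end())
            it = pid_fold.emplace(pid, pick(generator)).first;
        folds.push_back(it->second);
    }
    return folds;
}
```

Sample-level assignment, as in get_cv, would instead draw a fold for every sample independently, so one pid could appear in both the training and the test part of a fold.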
 

Variables

unordered_map< int, string > predictor_type_to_name
 Mapping from a MedPredictorTypes enum value to the model name string.
 

Detailed Description

MedAlgo - APIs to different algorithms: Linear Models, RF, GBM, KNN, and more.

Enumeration Type Documentation

◆ MedPredictorTypes

Enumerator
MODEL_LINEAR_MODEL 

to_use:"linear_model" Linear Model - creates MedLM

MODEL_QRF 

to_use:"qrf" Q-Random-Forest - creates MedQRF

MODEL_KNN 

to_use:"knn" K Nearest Neighbour - creates MedKNN

MODEL_BP 

to_use:"BP" Neural Network Back Propagation - creates MedBP

MODEL_MARS 

to_use:"mars" Multivariate Adaptive Regression Splines - creates MedMars

MODEL_GD_LINEAR 

to_use:"gdlm" Gradient Descent/Full solution ridge - creates MedGDLM

MODEL_MULTI_CLASS 

to_use:"multi_class" general one vs. all multi class extension - creates MedMultiClass

MODEL_XGB 

to_use:"xgb" XGBoost - creates MedXGB

MODEL_LASSO 

to_use:"lasso" Lasso model - creates MedLasso

MODEL_MIC_NET 

to_use:"micNet" Home brew Neural Net implementation (Allows deep learning) - creates MedMicNet

MODEL_BOOSTER 

to_use:"booster" general booster (meta algorithm) - creates MedBooster

MODEL_DEEP_BIT 

to_use:"deep_bit" Nir's DeepBit method - creates MedDeepBit

MODEL_LIGHTGBM 

to_use:"lightgbm" the celebrated LightGBM algorithm - creates MedLightGBM

MODEL_SPECIFIC_GROUPS_MODELS 

to_use:"multi_models" splits the model by a specific value (for example age range) and trains a different model for each bin - creates MedSpecificGroupModels

MODEL_SVM 

to_use:"svm" Svm model - creates MedSvm

MODEL_LINEAR_SGD 

to_use:"linear_sgd" linear model using our customized SGD - creates MedLinearModel

MODEL_VW 

to_use:"vw" Vowpal Wabbit, Yahoo Research library - creates MedVW

MODEL_TQRF 

to_use:"tqrf" TQRF model - creates MedTQRF

MODEL_BART 

to_use:"bart" MedBART model using BART

MODEL_EXTERNAL_NN 

to_use:"external_nn" initializes a neural net from a layers file - creates MedExternalNN

MODEL_SIMPLE_ENSEMBLE 

to_use:"simple_ensemble" trains one or more given models and ensembles them with user-supplied weights - creates MedSimpleEnsemble

MODEL_BY_MISSING_VALUES_SUBSET 

to_use:"by_missing_value_subset" chooses a MedPredictor trained on a subset of the features based on missing values, picking the best fit - creates MedPredictorsByMissingValues
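
The to_use strings above pair with predictor_name_to_type and predictor_type_to_name. The sketch below illustrates that two-way lookup on a small subset of the enumerators. Everything prefixed with sketch_ or SKETCH_ is hypothetical, and returning the LAST sentinel for an unknown name is an assumption, since this page does not document what the real function does in that case.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Subset of the enumerator values documented above (note that 2 is unused).
enum SketchPredictorTypes {
    SKETCH_MODEL_LINEAR_MODEL = 0,
    SKETCH_MODEL_QRF = 1,
    SKETCH_MODEL_KNN = 3,
    SKETCH_MODEL_XGB = 8,
    SKETCH_MODEL_LAST
};

// Type -> name table, shaped like predictor_type_to_name.
inline const std::unordered_map<int, std::string> &sketch_type_to_name() {
    static const std::unordered_map<int, std::string> table = {
        {SKETCH_MODEL_LINEAR_MODEL, "linear_model"},
        {SKETCH_MODEL_QRF, "qrf"},
        {SKETCH_MODEL_KNN, "knn"},
        {SKETCH_MODEL_XGB, "xgb"},
    };
    return table;
}

// Name -> type lookup; unknown names map to SKETCH_MODEL_LAST (assumption).
inline SketchPredictorTypes sketch_name_to_type(const std::string &name) {
    for (const auto &entry : sketch_type_to_name())
        if (entry.second == name)
            return static_cast<SketchPredictorTypes>(entry.first);
    return SKETCH_MODEL_LAST;
}

// Type -> name lookup; the caller must pass a type present in the table.
inline std::string sketch_type_name(SketchPredictorTypes type) {
    auto it = sketch_type_to_name().find(type);
    assert(it != sketch_type_to_name().end());
    return it->second;
}
```

Keeping a single table and deriving the reverse lookup from it avoids the two directions drifting apart when an enumerator is added.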

Function Documentation

◆ KMeans() [1/4]

int KMeans ( float *  x,
int  nrows,
int  ncols,
int  K,
float *  centers,
int *  clusters,
float *  dists 
)

K-Means: x is the input matrix (N*M, one sample per row).

K - number of clusters. centers - output centroids of the clusters (K*M). clusters - output cluster index for each sample, from 0 to K-1 (N*1). dists - output distance of each sample from each cluster (N*K).

◆ KMeans() [2/4]

int KMeans ( float *  x,
int  nrows,
int  ncols,
int  K,
int  max_iter,
float *  centers,
int *  clusters,
float *  dists,
bool  verbose_print = true 
)

K-Means: x is the input matrix (N*M, one sample per row).

K - number of clusters. centers - output centroids of the clusters (K*M). clusters - output cluster index for each sample, from 0 to K-1 (N*1). dists - output distance of each sample from each cluster (N*K).
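
To make the buffer shapes concrete, here is a minimal stand-alone sketch that follows the same parameter order. It is not the library's KMeans: sketch_kmeans is a hypothetical function running plain Lloyd iterations, and the row-major layout, Euclidean distance, first-K-rows initialization, and return value of 0 are all assumptions that this page does not confirm.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for the raw-pointer overload. Assumed layout:
//   x[i * ncols + j]        sample i, feature j          (N*M)
//   centers[k * ncols + j]  centroid k, feature j        (K*M)
//   clusters[i]             cluster index of sample i    (N*1)
//   dists[i * K + k]        distance sample i -> centroid k (N*K)
inline int sketch_kmeans(const float *x, int nrows, int ncols, int K, int max_iter,
                         float *centers, int *clusters, float *dists) {
    assert(x && centers && clusters && dists);
    assert(ncols > 0 && K > 0 && K <= nrows);

    // Initialize centroids from the first K samples (arbitrary choice).
    for (int k = 0; k < K; ++k)
        for (int j = 0; j < ncols; ++j)
            centers[k * ncols + j] = x[k * ncols + j];
    for (int i = 0; i < nrows; ++i) clusters[i] = -1;

    // Assignment step: fill dists and pick the nearest centroid per sample.
    auto assign = [&]() {
        bool changed = false;
        for (int i = 0; i < nrows; ++i) {
            int best = 0;
            for (int k = 0; k < K; ++k) {
                float d2 = 0.0f;
                for (int j = 0; j < ncols; ++j) {
                    const float diff = x[i * ncols + j] - centers[k * ncols + j];
                    d2 += diff * diff;
                }
                dists[i * K + k] = std::sqrt(d2);
                if (dists[i * K + k] < dists[i * K + best]) best = k;
            }
            if (clusters[i] != best) { clusters[i] = best; changed = true; }
        }
        return changed;
    };

    bool changed = assign();
    for (int iter = 0; iter < max_iter && changed; ++iter) {
        // Update step: each centroid becomes the mean of its samples.
        std::vector<float> sums(K * ncols, 0.0f);
        std::vector<int> counts(K, 0);
        for (int i = 0; i < nrows; ++i) {
            ++counts[clusters[i]];
            for (int j = 0; j < ncols; ++j)
                sums[clusters[i] * ncols + j] += x[i * ncols + j];
        }
        for (int k = 0; k < K; ++k) {
            if (counts[k] == 0) continue;  // empty cluster keeps its centroid
            for (int j = 0; j < ncols; ++j)
                centers[k * ncols + j] = sums[k * ncols + j] / counts[k];
        }
        changed = assign();  // keeps clusters and dists in sync with centers
    }
    return 0;
}
```

The caller owns all three output buffers and must size them as K*M, N, and N*K elements before the call.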

◆ KMeans() [3/4]

int KMeans ( MedMat< float > &  x,
int  K,
int  max_iter,
MedMat< float > &  centers,
vector< int > &  clusters,
MedMat< float > &  dists 
)

K-Means: x is the input matrix (N*M, one sample per row).

K - number of clusters. centers - output centroids of the clusters (K*M). clusters - output cluster index for each sample, from 0 to K-1 (N*1). dists - output distance of each sample from each cluster (N*K).

◆ KMeans() [4/4]

int KMeans ( MedMat< float > &  x,
int  K,
MedMat< float > &  centers,
vector< int > &  clusters,
MedMat< float > &  dists 
)

K-Means: x is the input matrix (N*M, one sample per row).

K - number of clusters. centers - output centroids of the clusters (K*M). clusters - output cluster index for each sample, from 0 to K-1 (N*1). dists - output distance of each sample from each cluster (N*K).

◆ MedPCA()

int MedPCA ( MedMat< float > &  x,
MedMat< float > &  pca_base,
vector< float > &  varsum 
)

Given a matrix, returns the PCA base matrix and the cumulative relative variance explained by its components.

It is highly recommended to normalize the input matrix x before calling.
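
The page does not say which normalization is meant. One common reading is per-column z-scoring, so that features on larger scales do not dominate the variance. The sketch below shows that reading only: sketch_normalize_columns is a hypothetical helper, and a row-major std::vector<float> stands in for MedMat<float>, whose accessors are not described here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: z-scores every column of a row-major nrows*ncols
// matrix in place (zero mean, unit population standard deviation).
inline void sketch_normalize_columns(std::vector<float> &x,
                                     std::size_t nrows, std::size_t ncols) {
    assert(x.size() == nrows * ncols);
    if (nrows == 0) return;
    for (std::size_t j = 0; j < ncols; ++j) {
        double mean = 0.0;
        for (std::size_t i = 0; i < nrows; ++i) mean += x[i * ncols + j];
        mean /= static_cast<double>(nrows);

        double var = 0.0;
        for (std::size_t i = 0; i < nrows; ++i) {
            const double d = x[i * ncols + j] - mean;
            var += d * d;
        }
        const double sd = std::sqrt(var / static_cast<double>(nrows));

        // Constant columns carry no variance; map them to 0 instead of
        // dividing by zero.
        for (std::size_t i = 0; i < nrows; ++i) {
            const double centered = x[i * ncols + j] - mean;
            x[i * ncols + j] = static_cast<float>(sd > 0.0 ? centered / sd : 0.0);
        }
    }
}
```

After such a step, the normalized values would be copied into the matrix passed as x to MedPCA.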