Medial Code Documentation
LightGBM Namespace Reference

desc and descl2 fields must be written in reStructuredText format More...

Data Structures

class  Application
 The main entry point of LightGBM. This application has two tasks: Train and Predict. The Train task trains a new model. The Predict task predicts the scores of test data using an existing model and saves the scores to disk. More...
 
class  ArrayArgs
 Contains some operations for an array, e.g. ArgMax, TopK. More...
 
class  AUCMetric
 AUC metric for binary classification task. More...
 
class  Bin
 Interface for bin data. This class stores bin data for one feature. Unlike OrderedBin, this class stores data in the original order. Note that this may cause cache misses when constructing histograms, but it needs no re-ordering operation, so it will be faster than OrderedBin for dense features. More...
 
class  BinaryErrorMetric
 Error rate metric for binary classification task. More...
 
class  BinaryLogloss
 Objective function for binary classification. More...
 
class  BinaryLoglossMetric
 Log loss metric for binary classification task. More...
 
class  BinaryMetric
 Metric for binary classification task. Use static class "PointWiseLossCalculator" to calculate loss point-wise. More...
 
class  BinIterator
 Iterator for one bin column. More...
 
class  BinMapper
 This class is used to convert feature values into bins, and to store some meta information for the bins. More...
 
class  Booster
 
class  Boosting
 The interface for Boosting. More...
 
class  BruckMap
 The network structure for all_gather. More...
 
struct  Config
 
class  CrossEntropy
 Objective function for cross-entropy (with optional linear weights) More...
 
class  CrossEntropyLambda
 Objective function for alternative parameterization of cross-entropy (see top of file for explanation) More...
 
class  CrossEntropyLambdaMetric
 
class  CrossEntropyMetric
 
class  CSVParser
 
class  DART
 DART algorithm implementation, including training, prediction, and bagging. More...
 
class  DataParallelTreeLearner
 Data parallel learning algorithm. Workers use local data to construct histograms locally, then sync up global histograms. It is recommended when the data is large or the number of features is small. More...
 
class  DataPartition
 DataPartition is used to store the partition of data on a tree. More...
 
class  Dataset
 The main class of the data set, which is used for training or validation. More...
 
class  DatasetLoader
 
class  DCGCalculator
 Static class, used to calculate DCG score. More...
 
class  Dense4bitsBin
 
class  Dense4bitsBinIterator
 
class  DenseBin
 Used to store bins for dense features. Uses a template to reduce memory cost. More...
 
class  DenseBinIterator
 
class  FairLossMetric
 Fair loss for regression task. More...
 
class  FeatureGroup
 Used to store data and provide some operations on one feature group. More...
 
class  FeatureHistogram
 FeatureHistogram is used to construct and store a histogram for a feature. More...
 
class  FeatureMetainfo
 
class  FeatureParallelTreeLearner
 Feature parallel learning algorithm. Different machines find the best split on different features, then sync the global best split. It is recommended when the data is small or the number of features is large. More...
 
class  GammaDevianceMetric
 
class  GammaMetric
 
class  GBDT
 GBDT algorithm implementation, including training, prediction, and bagging. More...
 
class  GBDT_Accessor
 
class  GBDTBase
 
class  GOSS
 
class  GPUTreeLearner
 
struct  HistogramBinEntry
 Store data for one histogram bin. More...
 
class  HistogramPool
 
class  HuberLossMetric
 Huber loss for regression task. More...
 
class  KullbackLeiblerDivergence
 
class  L1Metric
 L1 loss for regression task. More...
 
class  L2Metric
 L2 loss for regression task. More...
 
class  LambdarankNDCG
 Objective function for LambdaRank with NDCG. More...
 
class  LeafSplits
 Used to find split candidates for a leaf. More...
 
class  LibSVMParser
 
struct  LightSplitInfo
 
class  Linkers
 A basic network communication wrapper. It wraps low-level communication methods, e.g. MPI, sockets and so on. This class wraps all linkers to other machines if needed. More...
 
struct  LocalFile
 
class  Log
 A static Log class. More...
 
class  MAPEMetric
 MAPE regression loss for regression task. More...
 
class  MapMetric
 
class  MemApp
 
class  Metadata
 This class is used to store some meta (non-feature) data for training data, e.g. labels, weights, initial scores, and query-level information. More...
 
class  Metric
 The interface of metric. Metric is used to calculate metric result. More...
 
class  MulticlassMetric
 Metric for multiclass task. Use static class "PointWiseLossCalculator" to calculate loss point-wise. More...
 
class  MulticlassOVA
 Objective function for multiclass classification, using a one-vs-all binary objective function. More...
 
class  MulticlassSoftmax
 Objective function for multiclass classification, using softmax as the objective function. More...
 
class  MultiErrorMetric
 Error rate metric for multiclass task. More...
 
class  MultiSoftmaxLoglossMetric
 Logloss for multiclass task. More...
 
class  NDCGMetric
 
class  Network
 A static class that contains some collective communication algorithms. More...
 
class  ObjectiveFunction
 The interface of Objective Function. More...
 
class  OrderedBin
 Interface for ordered bin data. Efficient for constructing histograms, especially for sparse bins. There are 2 advantages to using ordered bins. More...
 
class  OrderedSparseBin
 Interface for ordered bin data. Efficient for constructing histograms, especially for sparse bins. There are 2 advantages to using ordered bins. More...
 
struct  ParameterAlias
 
class  Parser
 Interface for Parser. More...
 
class  PipelineReader
 A pipeline file reader that uses 2 threads: one reads blocks from the file, the other processes them. More...
 
class  PoissonMetric
 Poisson regression loss for regression task. More...
 
struct  PredictionEarlyStopConfig
 
struct  PredictionEarlyStopInstance
 
class  Predictor
 Used to predict data with input model. More...
 
class  QuantileMetric
 Quantile loss for regression task. More...
 
class  Random
 A wrapper for random generator. More...
 
class  RecursiveHalvingMap
 Network structure for recursive halving algorithm. More...
 
class  RegressionFairLoss
 
class  RegressionGammaLoss
 Objective function for Gamma regression. More...
 
class  RegressionHuberLoss
 Huber regression loss. More...
 
class  RegressionL1loss
 L1 regression loss. More...
 
class  RegressionL2loss
 Objective function for regression. More...
 
class  RegressionMAPELOSS
 MAPE regression loss. More...
 
class  RegressionMetric
 Metric for regression task. Use static class "PointWiseLossCalculator" to calculate loss point-wise. More...
 
class  RegressionPoissonLoss
 Objective function for Poisson regression. More...
 
class  RegressionQuantileloss
 
class  RegressionTweedieLoss
 Objective function for Tweedie regression. More...
 
class  RF
 Random Forest implementation. More...
 
class  RMSEMetric
 RMSE loss for regression task. More...
 
class  ScoreUpdater
 Used to store and update score for data. More...
 
class  SerialTreeLearner
 Used for learning a tree on a single machine. More...
 
class  SparseBin
 
class  SparseBinIterator
 
struct  SplitInfo
 Used to store some information about a split point and its gain. More...
 
class  TextReader
 Read text data from file. More...
 
class  Threading
 
class  Tree
 Tree model. More...
 
class  TreeLearner
 Interface for tree learner. More...
 
class  TSVParser
 
class  TweedieMetric
 
struct  VirtualFileReader
 An interface for reading files into buffers. More...
 
struct  VirtualFileWriter
 An interface for writing files from buffers. More...
 
class  VotingParallelTreeLearner
 Voting-based data parallel learning algorithm. Like data parallel, but it does not aggregate histograms for all features. Instead, it uses voting to reduce the set of features and only aggregates histograms for the selected features. When both the data and the number of features are large, this can give a better speed-up. More...
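
The ArrayArgs entry in the list above names ArgMax and TopK as typical array operations. The following is a minimal sketch of what such helpers could look like, written with the standard library only. The names ArgMaxSketch and TopKSketch are hypothetical and are not the members of ArrayArgs; the tie-breaking rule (first maximum wins) is also an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical helper: index of the largest element.
// On ties, the first occurrence is returned (std::max_element behaviour).
template <typename T>
std::size_t ArgMaxSketch(const std::vector<T>& values) {
  assert(!values.empty());
  return static_cast<std::size_t>(std::distance(
      values.begin(), std::max_element(values.begin(), values.end())));
}

// Hypothetical helper: the k largest elements, in descending order.
// If k exceeds the array size, all elements are returned.
template <typename T>
std::vector<T> TopKSketch(std::vector<T> values, std::size_t k) {
  k = std::min(k, values.size());
  // Only the first k positions need to be fully sorted.
  std::partial_sort(values.begin(),
                    values.begin() + static_cast<std::ptrdiff_t>(k),
                    values.end(), std::greater<T>());
  values.resize(k);
  return values;
}
```

Using std::partial_sort keeps the cost near O(n log k), which matters when k is much smaller than the array.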
 
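
The BinMapper, HistogramBinEntry and FeatureHistogram entries above describe converting feature values into bins and building a per-feature histogram. The sketch below illustrates that idea under stated assumptions: bins are defined by a sorted list of upper bounds, and each histogram bin accumulates a gradient sum, a hessian sum and a count. All names and field names here are hypothetical and chosen for illustration; they are not taken from the classes listed above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-bin accumulator; field names are illustrative only.
struct BinEntrySketch {
  double sum_gradients = 0.0;
  double sum_hessians = 0.0;
  int count = 0;
};

// Map a raw feature value to a bin index, given sorted bin upper bounds.
// A value v lands in the first bin whose upper bound is >= v; values above
// the last bound fall into the last bin.
std::size_t ValueToBinSketch(const std::vector<double>& upper_bounds,
                             double value) {
  assert(!upper_bounds.empty());
  auto it = std::lower_bound(upper_bounds.begin(), upper_bounds.end(), value);
  if (it == upper_bounds.end()) --it;
  return static_cast<std::size_t>(it - upper_bounds.begin());
}

// Accumulate gradient statistics per bin for one feature.
std::vector<BinEntrySketch> BuildHistogramSketch(
    const std::vector<std::size_t>& bins, const std::vector<float>& gradients,
    const std::vector<float>& hessians, std::size_t num_bins) {
  assert(bins.size() == gradients.size() && bins.size() == hessians.size());
  std::vector<BinEntrySketch> hist(num_bins);
  for (std::size_t i = 0; i < bins.size(); ++i) {
    assert(bins[i] < num_bins);
    BinEntrySketch& entry = hist[bins[i]];
    entry.sum_gradients += gradients[i];
    entry.sum_hessians += hessians[i];
    ++entry.count;
  }
  return hist;
}
```

Because the loop reads rows in their original order, the writes into the histogram jump between bins, which is the cache-miss trade-off the Bin entry mentions.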

Typedefs

typedef int32_t data_size_t
 Type of data size; a signed type is preferred.
 
typedef float score_t
 Type of scores and gradients.
 
typedef float label_t
 Type of metadata, including weights and labels.
 
typedef int32_t comm_size_t
 
using PredictFunction = std::function< void(const std::vector< std::pair< int, double > > &, double *output)>
 
typedef void(* ReduceFunction) (const char *input, char *output, int type_size, comm_size_t array_size)
 
typedef void(* ReduceScatterFunction) (char *input, comm_size_t input_size, int type_size, const comm_size_t *block_start, const comm_size_t *block_len, int num_block, char *output, comm_size_t output_size, const ReduceFunction &reducer)
 
typedef void(* AllgatherFunction) (char *input, comm_size_t input_size, const comm_size_t *block_start, const comm_size_t *block_len, int num_block, char *output, comm_size_t output_size)
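
To make the ReduceFunction typedef above more concrete, here is a hedged sketch of a reducer that matches its signature and sums doubles element-wise into the output buffer. It assumes that array_size is a byte count and that type_size is the size of one element; the listing does not state either, so treat both as assumptions. SumDoublesReducer is a hypothetical name, and the two typedefs are repeated locally only so the sketch compiles on its own.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Local copies of the typedefs listed above, so the sketch is self-contained.
typedef std::int32_t comm_size_t;
typedef void (*ReduceFunction)(const char* input, char* output, int type_size,
                               comm_size_t array_size);

// Element-wise sum of doubles: output[i] += input[i].
// Assumption: array_size is in bytes and type_size == sizeof(double).
void SumDoublesReducer(const char* input, char* output, int type_size,
                       comm_size_t array_size) {
  assert(type_size == static_cast<int>(sizeof(double)));
  for (comm_size_t offset = 0; offset + type_size <= array_size;
       offset += type_size) {
    double in = 0.0;
    double out = 0.0;
    // memcpy avoids alignment and aliasing problems with raw char buffers.
    std::memcpy(&in, input + offset, sizeof(double));
    std::memcpy(&out, output + offset, sizeof(double));
    out += in;
    std::memcpy(output + offset, &out, sizeof(double));
  }
}
```

A reducer of this shape is what a data-parallel learner would need to merge local histograms into a global one, since the merge is a plain element-wise sum.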
 

Enumerations

enum  BinType { NumericalBin , CategoricalBin }
 
enum  MissingType { None , Zero , NaN }
 
enum  TaskType { kTrain , kPredict , kConvertModel , KRefitTree }
 Types of tasks.
 
enum  RecursiveHalvingNodeType { Normal , GroupLeader , Other }
 Node type in the recursive halving algorithm. When the number of machines is not a power of 2, the machines must be grouped so that the number of groups is a power of 2, with at most 2 machines per group. If a group has only 1 machine, that machine is a normal node. If a group has 2 machines, it has two types of nodes, one of which is the leader. The leader represents the group and communicates with the other groups.
 
enum class  LogLevel : int { Fatal = -1 , Warning = 0 , Info = 1 , Debug = 2 }
 
enum  DataType { INVALID , CSV , TSV , LIBSVM }
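
The grouping rule described for RecursiveHalvingNodeType above can be sketched as follows. With n machines and g the largest power of 2 not exceeding n, there are n - g groups of two machines (a leader plus one other node) and the remaining machines form single-machine groups. Which ranks get paired is not stated in the listing; pairing the lowest ranks first is an assumption made purely for illustration, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mirrors the enumeration listed above.
enum RecursiveHalvingNodeType { Normal, GroupLeader, Other };

// Largest power of two that is <= n (requires n >= 1).
int LargestPowerOfTwoAtMost(int n) {
  assert(n >= 1);
  int p = 1;
  while (p * 2 <= n) p *= 2;
  return p;
}

// Assign a node type to each machine rank.
// Assumption (illustrative only): surplus machines are paired starting from
// the lowest ranks as (leader, other), (leader, other), ...
std::vector<RecursiveHalvingNodeType> AssignNodeTypesSketch(int num_machines) {
  const int num_groups = LargestPowerOfTwoAtMost(num_machines);
  const int num_pairs = num_machines - num_groups;  // groups with 2 machines
  std::vector<RecursiveHalvingNodeType> types(
      static_cast<std::size_t>(num_machines), Normal);
  for (int i = 0; i < num_pairs; ++i) {
    types[static_cast<std::size_t>(2 * i)] = GroupLeader;
    types[static_cast<std::size_t>(2 * i + 1)] = Other;
  }
  return types;
}
```

For 6 machines this yields two pairs and two single machines, so 4 groups take part in the halving steps.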
 

Functions

LIGHTGBM_EXPORT PredictionEarlyStopInstance CreatePredictionEarlyStopInstance (const std::string &type, const PredictionEarlyStopConfig &config)
 Create an early stopping algorithm of the given type, with the given round_period and margin threshold.
 
std::string GetBoostingTypeFromModelFile (const char *filename)
 
double ObtainAutomaticInitialScore (const ObjectiveFunction *fobj, int class_id)
 
int LGBM_APIHandleException (const std::exception &ex)
 
int LGBM_APIHandleException (const std::string &ex)
 
bool NeedFilter (const std::vector< int > &cnt_in_bin, int total_cnt, int filter_cnt, BinType bin_type)
 
std::vector< double > GreedyFindBin (const double *distinct_values, const int *counts, int num_distinct_values, int max_bin, size_t total_cnt, int min_data_in_bin)
 
std::vector< double > FindBinWithZeroAsOneBin (const double *distinct_values, const int *counts, int num_distinct_values, int max_bin, size_t total_sample_cnt, int min_data_in_bin)
 
void GetBoostingType (const std::unordered_map< std::string, std::string > &params, std::string *boosting)
 
void GetObjectiveType (const std::unordered_map< std::string, std::string > &params, std::string *objective)
 
void GetMetricType (const std::unordered_map< std::string, std::string > &params, std::vector< std::string > *metric)
 
void GetTaskType (const std::unordered_map< std::string, std::string > &params, TaskType *task)
 
void GetDeviceType (const std::unordered_map< std::string, std::string > &params, std::string *device_type)
 
void GetTreeLearnerType (const std::unordered_map< std::string, std::string > &params, std::string *tree_learner)
 
bool CheckMultiClassObjective (const std::string &objective)
 
std::vector< std::vector< int > > NoGroup (const std::vector< int > &used_features)
 
int GetConfilctCount (const std::vector< bool > &mark, const int *indices, int num_indices, int max_cnt)
 
void MarkUsed (std::vector< bool > &mark, const int *indices, int num_indices)
 
std::vector< std::vector< int > > FindGroups (const std::vector< std::unique_ptr< BinMapper > > &bin_mappers, const std::vector< int > &find_order, int **sample_indices, const int *num_per_col, size_t total_sample_cnt, data_size_t max_error_cnt, data_size_t filter_cnt, data_size_t num_data, bool is_use_gpu)
 
std::vector< std::vector< int > > FastFeatureBundling (std::vector< std::unique_ptr< BinMapper > > &bin_mappers, int **sample_indices, const int *num_per_col, size_t total_sample_cnt, const std::vector< int > &used_features, double max_conflict_rate, data_size_t num_data, data_size_t min_data, double sparse_threshold, bool is_enable_sparse, bool is_use_gpu)
 
void GetStatistic (const char *str, int *comma_cnt, int *tab_cnt, int *colon_cnt)
 
int GetLabelIdxForLibsvm (std::string &str, int num_features, int label_idx)
 
int GetLabelIdxForTSV (std::string &str, int num_features, int label_idx)
 
int GetLabelIdxForCSV (std::string &str, int num_features, int label_idx)
 
void getline (std::stringstream &ss, std::string &line, const VirtualFileReader *reader, std::vector< char > &buffer, size_t buffer_size)
 
void SyncUpGlobalBestSplit (char *input_buffer_, char *output_buffer_, SplitInfo *smaller_best_split, SplitInfo *larger_best_split, int max_cat_threshold)
 
std::function< std::vector< double >(int row_idx)> RowFunctionFromDenseMatric (const void *data, int num_row, int num_col, int data_type, int is_row_major)
 
std::function< std::vector< std::pair< int, double > >(int row_idx)> RowPairFunctionFromDenseMatric (const void *data, int num_row, int num_col, int data_type, int is_row_major)
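
The two row-function factories listed above return a callable that yields one row of a dense matrix given its index. A simplified sketch of that pattern is shown below. It handles only double data and takes a bool for the layout, whereas the listed functions take a void pointer, a data_type code and an int flag, so RowFunctionSketch is a hypothetical stand-in and not the listed function.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical simplification: returns a callable that copies out one row of
// a dense matrix stored either row-major or column-major.
// The caller must keep `data` alive for as long as the callable is used.
std::function<std::vector<double>(int row_idx)> RowFunctionSketch(
    const double* data, int num_row, int num_col, bool is_row_major) {
  assert(data != nullptr && num_row >= 0 && num_col >= 0);
  if (is_row_major) {
    // Row i occupies a contiguous range of num_col values.
    return [=](int row_idx) {
      const double* row = data + static_cast<std::size_t>(row_idx) * num_col;
      return std::vector<double>(row, row + num_col);
    };
  }
  // Column-major: element (i, j) lives at index j * num_row + i.
  return [=](int row_idx) {
    std::vector<double> out(static_cast<std::size_t>(num_col));
    for (int j = 0; j < num_col; ++j) {
      out[static_cast<std::size_t>(j)] =
          data[static_cast<std::size_t>(j) * num_row + row_idx];
    }
    return out;
  };
}
```

Choosing the layout once, when the callable is built, keeps the per-row call free of branching on the storage order.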
 

Variables

const int kDefaultNumLeaves = 31
 
const score_t kMinScore = -std::numeric_limits<score_t>::infinity()
 
const score_t kEpsilon = 1e-15f
 
const double kZeroThreshold = 1e-35f
 
const std::string kModelVersion = "v2"
 
const std::string kHdfsProto = "hdfs://"
 
const size_t kNumFastIndex = 64
 

Detailed Description

desc and descl2 fields must be written in reStructuredText format

This file is auto-generated by LightGBM\helpers\parameter_generator.py from the LightGBM\include\LightGBM\config.h file.