Medial Code Documentation
LightGBM::OrderedBin Class Reference (abstract)

Interface for ordered bin data. Efficient for constructing histograms, especially for sparse bins. Using an ordered bin has two advantages.

#include <bin.h>

Inheritance diagram for LightGBM::OrderedBin:
LightGBM::OrderedSparseBin< VAL_T >

Public Member Functions

virtual ~OrderedBin ()
 virtual destructor
 
virtual void Init (const char *used_indices, data_size_t num_leaves)=0
 Initialization logic.
 
virtual void ConstructHistogram (int leaf, const score_t *gradients, const score_t *hessians, HistogramBinEntry *out) const =0
 Constructs a histogram using this bin. Note: unlike Bin, OrderedBin does not use ordered gradients and ordered hessians, because zero bins are skipped and it is therefore hard to know the relative index within a leaf for a sparse bin.
 
virtual void ConstructHistogram (int leaf, const score_t *gradients, HistogramBinEntry *out) const =0
 Constructs a histogram using this bin. Note: unlike Bin, OrderedBin does not use ordered gradients and ordered hessians, because zero bins are skipped and it is therefore hard to know the relative index within a leaf for a sparse bin.
 
virtual void Split (int leaf, int right_leaf, const char *is_in_leaf, char mark)=0
 Splits the current bin and re-orders its data by leaf.
 
virtual data_size_t NonZeroCount (int leaf) const =0
 

Detailed Description

Interface for ordered bin data. Efficient for constructing histograms, especially for sparse bins. Using an ordered bin has two advantages:

  1. It groups the data by leaf, which improves the cache hit rate.
  2. It stores only the non-zero bins, which speeds up histogram construction for sparse features.

However, it brings an additional cost: the bins must be re-ordered after every split, which is expensive for dense features. Ordered bins are therefore used only for sparse features.
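
The layout described above can be pictured with a small sketch. It is not the library's implementation: the names ToyPair, ToyOrderedBin, and LeafRows are hypothetical, and the choice of one contiguous slice per leaf (a start offset plus a count) is an assumption made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical (row index, bin value) entry; only non-zero bins are stored.
struct ToyPair {
  int row;
  std::uint8_t bin;
};

// Hypothetical container: entries of the same leaf sit next to each other,
// and each leaf is described by a start offset and a count (an assumption).
struct ToyOrderedBin {
  std::vector<ToyPair> pairs;
  std::vector<std::size_t> leaf_start;
  std::vector<std::size_t> leaf_cnt;

  // Number of stored (non-zero) entries that belong to the given leaf.
  std::size_t NonZeroCount(int leaf) const {
    return leaf_cnt[static_cast<std::size_t>(leaf)];
  }
};

// Collects the original row indices stored for one leaf by walking its slice.
std::vector<int> LeafRows(const ToyOrderedBin& ob, int leaf) {
  const std::size_t l = static_cast<std::size_t>(leaf);
  std::vector<int> rows;
  for (std::size_t i = ob.leaf_start[l]; i < ob.leaf_start[l] + ob.leaf_cnt[l]; ++i) {
    rows.push_back(ob.pairs[i].row);
  }
  return rows;
}
```

Because a leaf's entries are contiguous, scanning one leaf touches a single run of memory, and rows whose bin is zero cost nothing at all.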

Member Function Documentation

◆ ConstructHistogram() [1/2]

virtual void LightGBM::OrderedBin::ConstructHistogram ( int  leaf,
const score_t *  gradients,
const score_t *  hessians,
HistogramBinEntry *  out 
) const
pure virtual

Constructs a histogram using this bin. Note: unlike Bin, OrderedBin does not use ordered gradients and ordered hessians, because zero bins are skipped and it is therefore hard to know the relative index within a leaf for a sparse bin.

Parameters
leaf: Index of the leaf whose data is used to construct the histogram
gradients: Gradients. Note: not ordered by leaf
hessians: Hessians. Note: not ordered by leaf
out: Output result

Implemented in LightGBM::OrderedSparseBin< VAL_T >.
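
A minimal sketch of what the note about non-ordered gradients implies: each stored entry keeps its original row index, and that index is used to look up the gradient and hessian. The types ToyPair and ToyHistogramEntry, the function ConstructHistogramSketch, and the alias of score_t to float are all assumptions for illustration, not the library's definitions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using score_t = float;  // assumption: the real typedef is not shown on this page

// Hypothetical stand-in for HistogramBinEntry.
struct ToyHistogramEntry {
  double sum_gradients = 0.0;
  double sum_hessians = 0.0;
  int cnt = 0;
};

// Hypothetical (row index, bin value) entry; only non-zero bins are stored.
struct ToyPair {
  int row;
  std::uint8_t bin;
};

// Accumulates the histogram for one leaf, given as the slice
// [start, start + count) of `pairs`. Gradients and hessians are indexed by
// the original row index, so they do not need to be re-ordered by leaf.
void ConstructHistogramSketch(const std::vector<ToyPair>& pairs,
                              std::size_t start, std::size_t count,
                              const score_t* gradients,
                              const score_t* hessians,
                              ToyHistogramEntry* out) {
  for (std::size_t i = start; i < start + count; ++i) {
    const ToyPair& p = pairs[i];
    out[p.bin].sum_gradients += gradients[p.row];
    out[p.bin].sum_hessians += hessians[p.row];
    out[p.bin].cnt += 1;
  }
}
```

Rows whose bin is zero never appear in `pairs`, so this sketch leaves the entry for bin 0 untouched; how a caller accounts for those rows is outside its scope.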

◆ ConstructHistogram() [2/2]

virtual void LightGBM::OrderedBin::ConstructHistogram ( int  leaf,
const score_t *  gradients,
HistogramBinEntry *  out 
) const
pure virtual

Constructs a histogram using this bin. Note: unlike Bin, OrderedBin does not use ordered gradients and ordered hessians, because zero bins are skipped and it is therefore hard to know the relative index within a leaf for a sparse bin.

Parameters
leaf: Index of the leaf whose data is used to construct the histogram
gradients: Gradients. Note: not ordered by leaf
out: Output result

Implemented in LightGBM::OrderedSparseBin< VAL_T >.

◆ Init()

virtual void LightGBM::OrderedBin::Init ( const char *  used_indices,
data_size_t  num_leaves 
)
pure virtual

Initialization logic.

Parameters
used_indices: If used_indices.size() == 0, all data is used; otherwise, used_indices[i] == true means the i-th data point is used (this logic was built for bagging)
num_leaves: Number of leaves in this iteration
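
The used_indices rule can be sketched as follows. This is an illustration only: InitSketch, ToyPair, and ToyOrderedBin are hypothetical names, used_indices is modelled as a std::vector<char> so that the documented size() check can be written literally (the real parameter is a const char *), and placing every stored entry in leaf 0 at the start is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical (row index, bin value) entry; only non-zero bins are stored.
struct ToyPair {
  int row;
  std::uint8_t bin;
};

// Hypothetical container with one contiguous slice of `pairs` per leaf.
struct ToyOrderedBin {
  std::vector<ToyPair> pairs;
  std::vector<std::size_t> leaf_start;
  std::vector<std::size_t> leaf_cnt;
};

// Builds the ordered bin from a dense column of bin values.
// An empty `used_indices` means every row is used; otherwise row i is used
// only when used_indices[i] is non-zero (for example, rows kept by bagging).
// Requires num_leaves >= 1.
ToyOrderedBin InitSketch(const std::vector<std::uint8_t>& bins,
                         const std::vector<char>& used_indices,
                         int num_leaves) {
  ToyOrderedBin ob;
  ob.leaf_start.assign(static_cast<std::size_t>(num_leaves), 0);
  ob.leaf_cnt.assign(static_cast<std::size_t>(num_leaves), 0);
  for (std::size_t i = 0; i < bins.size(); ++i) {
    const bool used = used_indices.empty() || used_indices[i] != 0;
    if (used && bins[i] != 0) {
      ob.pairs.push_back(ToyPair{static_cast<int>(i), bins[i]});
    }
  }
  // Assumption: before any split, all stored entries belong to leaf 0.
  ob.leaf_cnt[0] = ob.pairs.size();
  return ob;
}
```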

◆ Split()

virtual void LightGBM::OrderedBin::Split ( int  leaf,
int  right_leaf,
const char *  is_in_leaf,
char  mark 
)
pure virtual

Splits the current bin and re-orders its data by leaf.

Parameters
leaf: Index of the leaf to split
right_leaf: Index of the new leaf created by this split
is_in_leaf: is_in_leaf[i] == mark means the i-th data point will be in the left leaf after the split
mark: Marker value; is_in_leaf[i] == mark means the i-th data point will be in the left leaf after the split

Implemented in LightGBM::OrderedSparseBin< VAL_T >.
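
The re-ordering step can be sketched with a partition over the slice of the leaf being split. This is a sketch under stated assumptions, not the library's code: SplitSketch, ToyPair, and ToyOrderedBin are hypothetical, the per-leaf start/count layout is assumed, and std::stable_partition is used here for clarity rather than because the real implementation is known to use it.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical (row index, bin value) entry; only non-zero bins are stored.
struct ToyPair {
  int row;
  std::uint8_t bin;
};

// Hypothetical container with one contiguous slice of `pairs` per leaf.
struct ToyOrderedBin {
  std::vector<ToyPair> pairs;
  std::vector<std::size_t> leaf_start;
  std::vector<std::size_t> leaf_cnt;
};

// Splits `leaf` in place: entries whose row satisfies is_in_leaf[row] == mark
// stay in `leaf` (the left child); the rest become the slice of `right_leaf`.
// Assumes leaf != right_leaf and that both indices are valid.
void SplitSketch(ToyOrderedBin& ob, int leaf, int right_leaf,
                 const char* is_in_leaf, char mark) {
  const std::size_t l = static_cast<std::size_t>(leaf);
  const std::size_t r = static_cast<std::size_t>(right_leaf);
  auto first = ob.pairs.begin() + static_cast<std::ptrdiff_t>(ob.leaf_start[l]);
  auto last = first + static_cast<std::ptrdiff_t>(ob.leaf_cnt[l]);
  // Move the left-child entries to the front of the slice, keeping order.
  auto mid = std::stable_partition(first, last, [&](const ToyPair& p) {
    return is_in_leaf[p.row] == mark;
  });
  const std::size_t left_cnt = static_cast<std::size_t>(mid - first);
  ob.leaf_start[r] = ob.leaf_start[l] + left_cnt;
  ob.leaf_cnt[r] = ob.leaf_cnt[l] - left_cnt;
  ob.leaf_cnt[l] = left_cnt;
}
```

The partition only touches the entries of the leaf being split, but it still runs after every split, which is the re-ordering cost that makes this layout unattractive for dense features.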


The documentation for this class was generated from the following file: