Interface for ordered bin data. Efficient for constructing histograms, especially for sparse bins. Using an ordered bin has two advantages.

#include <ordered_sparse_bin.hpp>

Inherits LightGBM::OrderedBin.

Data Structures
struct	SparsePair
	Pair to store one bin entry.

Public Member Functions
	OrderedSparseBin (const SparseBin< VAL_T > *bin_data)

void	Init (const char *used_idices, int num_leaves) override

void	ConstructHistogram (int leaf, const score_t *gradients, const score_t *hessians, HistogramBinEntry *out) const override
	Construct the histogram using this bin. Note: unlike Bin, OrderedBin does not use ordered gradients and ordered hessians, because zero bins are skipped and it is therefore hard to know the relative index within a leaf for a sparse bin.

void	ConstructHistogram (int leaf, const score_t *gradients, HistogramBinEntry *out) const override
	Construct the histogram using this bin. Note: unlike Bin, OrderedBin does not use ordered gradients and ordered hessians, because zero bins are skipped and it is therefore hard to know the relative index within a leaf for a sparse bin.

void	Split (int leaf, int right_leaf, const char *is_in_leaf, char mark) override
	Split the current bin and reorder the data by leaf.

data_size_t	NonZeroCount (int leaf) const override

OrderedSparseBin< VAL_T > &	operator= (const OrderedSparseBin< VAL_T > &)=delete
	Disable copy.

	OrderedSparseBin (const OrderedSparseBin< VAL_T > &)=delete
	Disable copy.

Public Member Functions inherited from LightGBM::OrderedBin
virtual	~OrderedBin ()
	virtual destructor

virtual void	Init (const char *used_indices, data_size_t num_leaves)=0
	Initialization logic.

Detailed Description

template<typename VAL_T>
class LightGBM::OrderedSparseBin< VAL_T >

Interface for ordered bin data. Efficient for constructing histograms, especially for sparse bins. Using an ordered bin has two advantages:

1. It groups the data by leaf, which improves the cache hit rate.
2. It stores only the non-zero bins, which speeds up histogram construction for sparse features.

However, it brings an additional cost: the bins must be reordered after every split, which is expensive for dense features. Ordered bins are therefore used only in sparse situations.
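The layout described above can be illustrated with a minimal sketch. It is not the class's actual implementation: `PairSketch`, `OrderedSketch`, and `BuildOrdered` are hypothetical names introduced here, and the fixed-width integer types are assumptions standing in for `data_size_t` and `VAL_T`. The sketch drops zero bins from a dense column and records, per leaf, the contiguous range of pairs that the leaf owns.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for SparsePair: one non-zero bin entry.
struct PairSketch {
  int32_t ridx;  // row index of the data point
  uint8_t bin;   // non-zero bin value of that row
};

// Hypothetical container: non-zero pairs grouped by leaf. Leaf l owns the
// contiguous range [leaf_start[l], leaf_start[l] + leaf_cnt[l]).
struct OrderedSketch {
  std::vector<PairSketch> pairs;
  std::vector<int32_t> leaf_start;
  std::vector<int32_t> leaf_cnt;
};

// Build the initial layout: zero bins are dropped and every kept pair
// belongs to leaf 0, so leaf 0 spans the whole pair array.
inline OrderedSketch BuildOrdered(const std::vector<uint8_t>& dense_bins,
                                  int num_leaves) {
  assert(num_leaves > 0);
  OrderedSketch s;
  for (int32_t i = 0; i < static_cast<int32_t>(dense_bins.size()); ++i) {
    if (dense_bins[i] != 0) {
      s.pairs.push_back({i, dense_bins[i]});
    }
  }
  s.leaf_start.assign(num_leaves, 0);
  s.leaf_cnt.assign(num_leaves, 0);
  s.leaf_cnt[0] = static_cast<int32_t>(s.pairs.size());
  return s;
}
```

Because each leaf's pairs are contiguous, a histogram pass over one leaf touches a single compact range of memory, and its cost scales with the number of non-zero entries rather than the number of rows.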

Member Function Documentation

◆ ConstructHistogram() [1/2]

template<typename VAL_T >

void LightGBM::OrderedSparseBin< VAL_T >::ConstructHistogram	(	int	leaf,
		const score_t *	gradients,
		const score_t *	hessians,
		HistogramBinEntry *	out
	)		const

inline override virtual

Construct the histogram using this bin. Note: unlike Bin, OrderedBin does not use ordered gradients and ordered hessians, because zero bins are skipped and it is therefore hard to know the relative index within a leaf for a sparse bin.

Parameters

leaf	The leaf whose data is used to construct the histogram
gradients	Gradients. Note: not ordered by leaf
hessians	Hessians. Note: not ordered by leaf
out	Output result

Implements LightGBM::OrderedBin.
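To make the note about non-ordered gradients concrete, here is a hedged sketch of how such a histogram pass could look. It is an illustration under stated assumptions, not the library's code: `PairSketch`, `HistEntrySketch`, `ScoreSketch`, and `ConstructHistogramSketch` are hypothetical names, and the field names of the histogram entry are assumed. The key point is that gradients and hessians are looked up by the original row index stored in each pair, not by the position within the leaf.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for SparsePair: one non-zero bin entry.
struct PairSketch {
  int32_t ridx;  // original row index
  uint8_t bin;   // non-zero bin value
};

// Hypothetical stand-in for HistogramBinEntry (field names are assumed).
struct HistEntrySketch {
  double sum_gradients = 0.0;
  double sum_hessians = 0.0;
  int32_t cnt = 0;
};

// Assumed score type for this sketch only.
using ScoreSketch = float;

// Accumulate the histogram over one leaf's range [start, start + cnt).
// gradients and hessians are indexed by the original row index (ridx),
// so they must NOT be reordered by leaf.
inline void ConstructHistogramSketch(const std::vector<PairSketch>& pairs,
                                     int32_t start, int32_t cnt,
                                     const ScoreSketch* gradients,
                                     const ScoreSketch* hessians,
                                     HistEntrySketch* out) {
  assert(start >= 0 && cnt >= 0);
  assert(start + cnt <= static_cast<int32_t>(pairs.size()));
  for (int32_t i = start; i < start + cnt; ++i) {
    const PairSketch& p = pairs[i];
    out[p.bin].sum_gradients += gradients[p.ridx];
    out[p.bin].sum_hessians += hessians[p.ridx];
    ++out[p.bin].cnt;
  }
}
```

Rows whose bin is zero never appear in `pairs`, so this loop leaves `out[0]` untouched; a caller that needs the zero-bin entry would have to derive it separately, for example from the leaf totals.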

◆ ConstructHistogram() [2/2]

template<typename VAL_T >

void LightGBM::OrderedSparseBin< VAL_T >::ConstructHistogram	(	int	leaf,
		const score_t *	gradients,
		HistogramBinEntry *	out
	)		const

inline override virtual

Construct the histogram using this bin. Note: unlike Bin, OrderedBin does not use ordered gradients and ordered hessians, because zero bins are skipped and it is therefore hard to know the relative index within a leaf for a sparse bin.

Parameters

leaf	The leaf whose data is used to construct the histogram
gradients	Gradients. Note: not ordered by leaf
out	Output result

Implements LightGBM::OrderedBin.

◆ NonZeroCount()

template<typename VAL_T >

data_size_t LightGBM::OrderedSparseBin< VAL_T >::NonZeroCount ( int leaf ) const

inline override virtual

Implements LightGBM::OrderedBin.

◆ Split()

template<typename VAL_T >

void LightGBM::OrderedSparseBin< VAL_T >::Split	(	int	leaf,
		int	right_leaf,
		const char *	is_in_leaf,
		char	mark
	)

inline override virtual

Split the current bin and reorder the data by leaf.

Parameters

leaf	The leaf to split
right_leaf	The index of the new leaf created by this split
is_in_leaf	is_in_leaf[i] == mark means the i-th data point will be in the left leaf after the split
mark	The value compared against is_in_leaf[i] to select the data points that stay in the left leaf

Implements LightGBM::OrderedBin.
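The reordering step can be sketched as an in-place partition of the leaf's range. This is a minimal illustration under assumptions, not necessarily how the class implements it: `PairSketch`, `OrderedSketch`, and `SplitSketch` are hypothetical names. Pairs whose row satisfies `is_in_leaf[ridx] == mark` are swapped to the front of the range and stay in `leaf`; the remainder becomes the range of `right_leaf`.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for SparsePair: one non-zero bin entry.
struct PairSketch {
  int32_t ridx;  // original row index
  uint8_t bin;   // non-zero bin value
};

// Hypothetical container: leaf l owns the contiguous range
// [leaf_start[l], leaf_start[l] + leaf_cnt[l]) of pairs.
struct OrderedSketch {
  std::vector<PairSketch> pairs;
  std::vector<int32_t> leaf_start;
  std::vector<int32_t> leaf_cnt;
};

// Partition the range of `leaf` in place. Pairs with
// is_in_leaf[ridx] == mark move to the front and remain in `leaf`;
// the rest form the range of `right_leaf`.
inline void SplitSketch(OrderedSketch* s, int leaf, int right_leaf,
                        const char* is_in_leaf, char mark) {
  assert(leaf >= 0 && leaf < static_cast<int>(s->leaf_start.size()));
  assert(right_leaf >= 0 &&
         right_leaf < static_cast<int>(s->leaf_start.size()));
  const int32_t l_start = s->leaf_start[leaf];
  const int32_t l_end = l_start + s->leaf_cnt[leaf];
  int32_t new_left_end = l_start;  // one past the last left-leaf pair
  for (int32_t i = l_start; i < l_end; ++i) {
    if (is_in_leaf[s->pairs[i].ridx] == mark) {
      std::swap(s->pairs[new_left_end], s->pairs[i]);
      ++new_left_end;
    }
  }
  s->leaf_start[right_leaf] = new_left_end;
  s->leaf_cnt[leaf] = new_left_end - l_start;
  s->leaf_cnt[right_leaf] = l_end - new_left_end;
}
```

The pass is linear in the number of non-zero entries of the leaf being split, which is why the reordering cost stays small for sparse features and grows for dense ones.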

The documentation for this class was generated from the following file:

External/LightGBM_2.2.3/LightGBM-2.2.3/src/io/ordered_sparse_bin.hpp
