Used to store bins for a dense feature. Uses a template parameter to reduce memory cost.

#include <dense_bin.hpp>

Inheritance diagram for LightGBM::DenseBin< VAL_T >:

Public Member Functions
	DenseBin (data_size_t num_data)

void	Push (int, data_size_t idx, uint32_t value) override
	Push one record. The first (unnamed) parameter is the thread id.

void	ReSize (data_size_t num_data) override

BinIterator *	GetIterator (uint32_t min_bin, uint32_t max_bin, uint32_t default_bin) const override
	Get bin iterator of this bin for specific feature.

void	ConstructHistogram (const data_size_t *data_indices, data_size_t num_data, const score_t *ordered_gradients, const score_t *ordered_hessians, HistogramBinEntry *out) const override
	Construct the histogram of this feature. Note: ordered_gradients and ordered_hessians are used to improve the cache hit rate. The naive approach reads gradients[data_indices[i]] for each i, which is not cache friendly because the memory accesses are not contiguous. ordered_gradients and ordered_hessians are preprocessed copies reordered by data_indices, so ordered_gradients[i] is the gradient of record data_indices[i] (likewise for ordered_hessians).

void	ConstructHistogram (data_size_t num_data, const score_t *ordered_gradients, const score_t *ordered_hessians, HistogramBinEntry *out) const override

void	ConstructHistogram (const data_size_t *data_indices, data_size_t num_data, const score_t *ordered_gradients, HistogramBinEntry *out) const override
	Construct the histogram of this feature; this overload takes no ordered_hessians. Note: ordered_gradients is used to improve the cache hit rate. The naive approach reads gradients[data_indices[i]] for each i, which is not cache friendly because the memory accesses are not contiguous. ordered_gradients is a preprocessed copy reordered by data_indices, so ordered_gradients[i] is the gradient of record data_indices[i].

void	ConstructHistogram (data_size_t num_data, const score_t *ordered_gradients, HistogramBinEntry *out) const override

virtual data_size_t	Split (uint32_t min_bin, uint32_t max_bin, uint32_t default_bin, MissingType missing_type, bool default_left, uint32_t threshold, data_size_t *data_indices, data_size_t num_data, data_size_t *lte_indices, data_size_t *gt_indices) const override
	Split data according to threshold: records with bin <= threshold go to the left (lte_indices), all others go to the right (gt_indices).

virtual data_size_t	SplitCategorical (uint32_t min_bin, uint32_t max_bin, uint32_t default_bin, const uint32_t *threshold, int num_threahold, data_size_t *data_indices, data_size_t num_data, data_size_t *lte_indices, data_size_t *gt_indices) const override
	Split data according to threshold: records with bin <= threshold go to the left (lte_indices), all others go to the right (gt_indices).

data_size_t	num_data () const override
	Number of all data.

OrderedBin *	CreateOrderedBin () const override
	Dense features have no ordered bin.

void	FinishLoad () override
	Call this after all feature data has been pushed; it may reorganize the bin data into a better layout.

void	LoadFromMemory (const void *memory, const std::vector< data_size_t > &local_used_indices) override
	Load from memory.

void	CopySubset (const Bin *full_bin, const data_size_t *used_indices, data_size_t num_used_indices) override

void	SaveBinaryToFile (const VirtualFileWriter *writer) const override
	Save binary data to file.

size_t	SizesInByte () const override
	Get the size of this object in bytes.

Public Member Functions inherited from LightGBM::Bin
virtual	~Bin ()
	virtual destructor

Data Fields
friend	DenseBinIterator< VAL_T >

Protected Attributes
data_size_t	num_data_

std::vector< VAL_T >	data_

Additional Inherited Members
Static Public Member Functions inherited from LightGBM::Bin
static Bin *	CreateBin (data_size_t num_data, int num_bin, double sparse_rate, bool is_enable_sparse, double sparse_threshold, bool *is_sparse)
	Create object for bin data of one feature, will call CreateDenseBin or CreateSparseBin according to "is_sparse".

static Bin *	CreateDenseBin (data_size_t num_data, int num_bin)
	Create object for bin data of one feature, used for dense feature.

static Bin *	CreateSparseBin (data_size_t num_data, int num_bin)
	Create object for bin data of one feature, used for sparse feature.

Detailed Description

template<typename VAL_T>
class LightGBM::DenseBin< VAL_T >

Used to store bins for a dense feature. Uses a template parameter to reduce memory cost.
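To illustrate why the value type is a template parameter, here is a minimal sketch, not the library's implementation. `DenseBinSketch` and `BytesPerValue` are hypothetical names, and the rule "pick the smallest unsigned type that can hold `num_bin` distinct values" is an assumption made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for DenseBin<VAL_T>: one bin index of type VAL_T per record.
template <typename VAL_T>
struct DenseBinSketch {
  explicit DenseBinSketch(std::size_t num_data) : data_(num_data, 0) {}

  // Bytes used by the stored bin indices (ignores std::vector bookkeeping).
  std::size_t PayloadBytes() const { return data_.size() * sizeof(VAL_T); }

  std::vector<VAL_T> data_;
};

// Assumed selection rule (illustration only): the smallest unsigned type
// that can represent num_bin distinct bin indices.
inline std::size_t BytesPerValue(int num_bin) {
  assert(num_bin > 0);
  if (num_bin <= 256) return sizeof(std::uint8_t);
  if (num_bin <= 65536) return sizeof(std::uint16_t);
  return sizeof(std::uint32_t);
}
```

Under these assumptions, a feature with at most 256 bins needs one byte per record instead of four, so the storage for that feature shrinks by a factor of four compared with a fixed 32-bit value type.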

Member Function Documentation

◆ ConstructHistogram() [1/4]

template<typename VAL_T >

void LightGBM::DenseBin< VAL_T >::ConstructHistogram	(	const data_size_t *	data_indices,
		data_size_t	num_data,
		const score_t *	ordered_gradients,
		const score_t *	ordered_hessians,
		HistogramBinEntry *	out
	)		const

inline override virtual

Construct the histogram of this feature. Note: ordered_gradients and ordered_hessians are used to improve the cache hit rate. The naive approach reads gradients[data_indices[i]] for each i, which is not cache friendly because the memory accesses are not contiguous. ordered_gradients and ordered_hessians are preprocessed copies reordered by data_indices, so ordered_gradients[i] is the gradient of record data_indices[i] (likewise for ordered_hessians).

Parameters

data_indices	Used data indices in current leaf
num_data	Number of used data
ordered_gradients	Pointer to gradients, the data_indices[i]-th data's gradient is ordered_gradients[i]
ordered_hessians	Pointer to hessians, the data_indices[i]-th data's hessian is ordered_hessians[i]
out	Output Result
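The access pattern described above can be sketched as follows. This is a simplified illustration, not the library's code: `HistEntrySketch`, `ReorderSketch`, and `ConstructHistogramSketch` are hypothetical names, and the type aliases for `data_size_t` and `score_t` are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using data_size_t = std::int32_t;  // assumed alias, for illustration
using score_t = float;             // assumed alias, for illustration

// Hypothetical histogram entry: running sums and a record count per bin.
struct HistEntrySketch {
  double sum_gradients = 0.0;
  double sum_hessians = 0.0;
  data_size_t cnt = 0;
};

// Preprocessing step: copy values into the order given by data_indices,
// so that result[i] == values[data_indices[i]].
inline std::vector<score_t> ReorderSketch(const std::vector<score_t>& values,
                                          const std::vector<data_size_t>& data_indices) {
  std::vector<score_t> ordered(data_indices.size());
  for (std::size_t i = 0; i < data_indices.size(); ++i) {
    ordered[i] = values[static_cast<std::size_t>(data_indices[i])];
  }
  return ordered;
}

// The gradients and hessians are read sequentially (index i); only the bin
// lookup still goes through data_indices[i].
template <typename VAL_T>
void ConstructHistogramSketch(const std::vector<VAL_T>& bins,
                              const data_size_t* data_indices,
                              data_size_t num_data,
                              const score_t* ordered_gradients,
                              const score_t* ordered_hessians,
                              HistEntrySketch* out) {
  for (data_size_t i = 0; i < num_data; ++i) {
    const VAL_T bin = bins[static_cast<std::size_t>(data_indices[i])];
    out[bin].sum_gradients += ordered_gradients[i];
    out[bin].sum_hessians += ordered_hessians[i];
    ++out[bin].cnt;
  }
}
```

The caller is expected to size `out` to at least the number of bins of the feature; the sketch does not check this.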

◆ GetIterator()

Parameters

min_bin	min_bin of current used feature
max_bin	max_bin of current used feature
default_bin	default bin if bin not in [min_bin, max_bin]
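The role of min_bin, max_bin, and default_bin can be sketched as below. This is a simplified illustration under stated assumptions: `DenseBinIteratorSketch` is a hypothetical class, and mapping an in-range stored value to `raw - min_bin` is an assumption; the real iterator may apply a different offset. The sketch only shows the fallback to default_bin for stored values outside [min_bin, max_bin].

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical iterator over a dense bin vector that serves one feature
// whose stored values occupy the range [min_bin, max_bin].
template <typename VAL_T>
class DenseBinIteratorSketch {
 public:
  DenseBinIteratorSketch(const std::vector<VAL_T>* data, std::uint32_t min_bin,
                         std::uint32_t max_bin, std::uint32_t default_bin)
      : data_(data), min_bin_(min_bin), max_bin_(max_bin), default_bin_(default_bin) {}

  // Returns a feature-local bin for record idx, or default_bin_ when the
  // stored value falls outside [min_bin_, max_bin_].
  std::uint32_t Get(std::size_t idx) const {
    const std::uint32_t raw = (*data_)[idx];
    if (raw >= min_bin_ && raw <= max_bin_) {
      return raw - min_bin_;  // assumed mapping, for illustration only
    }
    return default_bin_;
  }

 private:
  const std::vector<VAL_T>* data_;
  std::uint32_t min_bin_;
  std::uint32_t max_bin_;
  std::uint32_t default_bin_;
};
```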

◆ LoadFromMemory()

Parameters

memory
local_used_indices

◆ Push()

Parameters

idx	Index of record
value	bin value of record
